# **Data Collection in Fragile States** Innovations from Africa and Beyond



*Editors* Johannes Hoogeveen World Bank Washington, DC, USA

Utz Pape World Bank Washington, DC, USA

ISBN 978-3-030-25119-2 ISBN 978-3-030-25120-8 (eBook) https://doi.org/10.1007/978-3-030-25120-8

© International Bank for Reconstruction and Development/The World Bank 2020. This book is an open access publication.

The opinions expressed in this publication are those of the authors/editors and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 3.0 IGO License (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons licence and indicate if changes were made.

The use of the International Bank for Reconstruction and Development/The World Bank's name, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written licence agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO licence. Note that the link provided above includes additional terms and conditions of the licence.

The images or other third party material in this book are included in the book's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Palgrave Macmillan imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

# **Foreword**

The world is becoming less safe and peaceful. According to the 2018 Global Peace Index prepared by the Institute for Economics and Peace, 42 countries experienced an increase in the intensity of internal conflict over the past decade, twice the number of countries that have improved. While progress is being made in certain areas (military spending declined slightly, for instance), peacefulness deteriorated as the intensity of conflict worsened.

Conflict has major costs, in terms of lives prematurely ended, human suffering and forgone development and economic opportunities. A civil war costs a medium-sized developing country the equivalent of 30 years of GDP growth; it takes 20 years for its trade levels to return to pre-war levels. To mitigate the long-term consequences of conflict on growth and poverty reduction, the World Bank Group is paying increasing attention to countries affected by conflict and violence. Since 2017, the World Bank Group has doubled its financial support for countries facing current or rising risks of fragility, opened special windows for assistance to refugees and host communities, and developed new financial instruments to support crisis preparedness and response.

For financing to be effective, a good understanding of the situation is essential. Without timely and reliable data, development interventions risk being based on anecdotal evidence, with all the risks that come with inadequate planning, poor designs, and ineffective targeting. Quality data are critical for development interventions to be effective but are hard to obtain in situations of violence and conflict. Worse, collecting good data is rarely a priority in situations where urgency trumps being deliberate.

This book offers a welcome reprieve from this habit. The authors care about collecting statistical information and have gone to great lengths to compile data in some of the world's most challenging circumstances. That they succeeded speaks to their tenacity and ability to think outside the box. The variety of approaches and solutions discussed means that many practitioners will find something of value in "Data Collection in Fragile Situations." The book effectively eliminates the notion that data cannot be collected in certain difficult circumstances. In doing so, it shifts the paradigm from "there are no data" to "how do we go about collecting data here?"

The innovations presented in this book are relevant beyond fragile situations, and the Poverty and Equity Global Practice I lead has started to apply approaches discussed here in other contexts. We are exploring the use of mobile phone surveys and permanent enumerators to strengthen statistical data collection for remote locations, many of which are small island states threatened by climate change. We are testing approaches to ask sensitive questions, for example to obtain better information about the occurrence of gender-based violence in World Bank projects. More generally, the innovations described in this book allow us to be more imaginative in creating feedback loops and introducing systematic learning in the World Bank's portfolio of projects.

These are just some of the ways in which the Poverty and Equity Global Practice is internalizing the innovations presented in this book. I am convinced that others too will find inspiration here. For readers who would like to know more, I urge them to contact the authors of the chapters directly. They will be more than happy to offer additional details or assistance. Contact details for all authors can be found in the contributor section.

Carolina Sánchez-Páramo, Senior Director, Poverty and Equity Global Practice, Washington, DC, USA

# **Acknowledgements**

This book benefited from the generous support of the Belgian TF 0A2158, the SPF TF, as well as the support of the management of the Poverty and Equity Global Practice. Feedback from participants at the 2018 Fragility Forum convinced us of the interest in this book. Hannah McNeish edited the document. The very constructive feedback from Paul Bance, Kathleen Beegle, Bernard Harborne, and Christina Wieser is gratefully acknowledged.


# **Notes on Contributors**

**Ana Aguilera** works as an Urban Development Specialist in the Latin America and Caribbean Region. Her work focuses on improving city management with an emphasis on urban economics and spatial development. Her work also comprises survey management and design to measure living standards and socioeconomic indicators in countries such as Tanzania, South Africa, Sierra Leone, Lebanon, Jordan and the Kurdistan region. Ana has contributed to various World Bank Urbanization Reviews, including Ethiopia, Nigeria, Turkey and Central America. In 2014 she was awarded the Youth Innovation Fund for her work using Big Data to understand mobility patterns in cities. Prior to the World Bank, Ana worked as an Economist at CAF's Direction of Public Policy and Competitiveness. She has also worked in the public and private sectors in Latin America and the United States, as well as in policy consulting, advising local and regional governments. Ana graduated as an Economist from Universidad Católica Andrés Bello in Caracas, and holds an M.Sc. in Public Policy from The University of Chicago.

**Mohamed Coulibaly** is a consultant in the Poverty and Equity Global Practice of the World Bank. After graduating from the National School of Statistics and Applied Economics (ENSEA) as an engineer in statistics and economics, he started his career at Bloomfield Investment Corporation, gaining experience in country and sector risk assessment and public debt rating. Before joining the World Bank, Mohamed was a research officer at the Cabinet of the Minister of Planning and Development in Cote d'Ivoire. His current work focuses on evaluating local development, harmonizing household survey data in Sub-Saharan Africa, and assessing fiscal policy impact on poverty and inequality.

**Stephanie Eckman** is a fellow at RTI International in Washington, DC specializing in methods to collect high quality survey data. Her research focuses on the combination of survey and geospatial data. Previously, she held teaching and research positions at the Institute for Employment Research in Nuremberg, Germany and at the University of Mannheim. Dr. Eckman received a Ph.D. in survey methodology from the University of Maryland.

**Alvin Etang** is a senior economist in the Poverty and Equity Global Practice at the World Bank. Before joining the World Bank, he was a postdoctoral associate at Yale University. His interest in micro-development has led him to focus on analysis of poverty and welfare outcomes. With substantial experience in household survey design and implementation, Alvin has worked in several African countries. He is currently managing the World Bank's "Listening to Africa" initiative, mobile phone panel surveys for welfare monitoring, which has won many awards including for innovation and knowledge. He has also taught undergraduate economics courses, and has designed and used economic experiments as a tool to analyze poverty issues. His research has been published in several academic journals and has also featured in popular press such as *The Economist*, *Wall Street Journal*, *Financial Times*, *The Atlantic*, *Frontline*, among others. He is a co-author of the book titled *Mobile Phone Panel Surveys in Developing Countries: A Practical Guide for Microdata Collection*. He holds a Ph.D. in economics from the University of Otago in New Zealand.

**Saad Gulzar** is an Assistant Professor of Political Science at Stanford University. He uses field experiments and data from government programs to study the determinants of political and bureaucratic effort toward citizen welfare. His research interests lie in the political economy of development and comparative politics, with a regional focus on South Asia. Gulzar earned his Ph.D. from New York University in 2017.

**Kristen Himelein** is a senior economist/statistician in the Poverty and Equity Global Practice at the World Bank, with extensive experience working in fragile, climate-affected, and post-conflict states. Her areas of expertise are survey methodology, sampling, and statistics, and her work has been published in peer-reviewed journals including the *Journal of Development Economics*, *Journal of Official Statistics*, and *Statistical Journal of the International Association for Official Statistics*, among others. She was also the project lead for high frequency cell phone surveys to measure the socio-economic impacts of Ebola in Sierra Leone and Liberia, which were widely disseminated in the international press. She holds a Master of Public Administration in International Development degree from the Harvard Kennedy School, and a graduate certificate in survey sampling from the Joint Program on Survey Methodology at the University of Maryland.

**Johannes Hoogeveen** is a lead economist in the Poverty and Equity Global Practice at the World Bank. He combines analytical and strategic work with the implementation of lending operations. He has published academic papers on poverty measurement, survey design, statistics governance, education, nutrition, informal insurance, and land reform. His current research interest revolves around creating feedback loops (particularly in fragile situations, exploiting new and established data collection technologies) and the relation between poverty, governance and identity. He was a manager at Twaweza, a national NGO in Tanzania, where he led a unit strengthening citizen accountability through feedback mechanisms. He holds a Ph.D. in economics from the Free University in Amsterdam.

**Mohammad Isaqzadeh** is a Ph.D. candidate at Princeton University. He has over seven years of experience as a consultant for the World Bank, working on the impact evaluation of NSP, NERAP, UCT and TUP programs in Afghanistan. He also taught for five years at the American University of Afghanistan. His research focuses on insurgencies, post-conflict governance, and the role of religion in political mobilization and public goods provision. He has co-authored *Policing Afghanistan: The Politics of the Lame Leviathan* (Oxford University Press), and "Violence and Risk Preference: Experimental Evidence from Afghanistan" (American Economic Review). He holds a master's degree in international development from Oxford University.

**Lennart Kaplan** is a researcher at Göttingen and Heidelberg University. As a member of the research group "Globalization and Development" Lennart focuses on the meso-level of development research. More specifically, his research combines impact evaluation methods with geospatial and survey approaches.

**Roy Katayama** is a senior economist in the Poverty and Equity Global Practice at the World Bank. His current work focuses on the design of data collection methods suitable for fragile settings, performance-based financing for statistical capacity building, iterative beneficiary monitoring for improved project implementation, enhanced digital census cartography, geospatial analysis of development, and global poverty monitoring. During his time at the World Bank, he has led analytical work on poverty and inequality, poverty measurement, poverty maps, welfare impact of shocks, targeting of social safety nets, and systematic country diagnostics. He has experience working in numerous Sub-Saharan African countries. He holds a Master of Public Administration in International Development from the Harvard Kennedy School of Government.

**Nandini Krishnan** is a senior economist in the Poverty and Equity Global Practice of the World Bank, currently leading its Afghanistan program. In the past, she has worked as the poverty economist in Iraq and the Philippines, co-led a multi-country survey and analysis of host communities and Syrian refugees, and has supported regional and corporate initiatives for data and monitoring. She has worked on labor market, gender and inclusion issues in Egypt, Jordan, the Palestinian territories, Yemen, and the MENA region, and supported impact evaluations of large scale projects and programs in Tanzania, Nigeria, and South Africa. As a member of the World Bank Research Group's Social Observatory Initiative, she supports World Bank operations to design systems that can learn from implementation data to improve effectiveness, and adapt program design. She holds a Ph.D. in Economics from Boston University.

**Johan Mistiaen**, a Belgian national, joined the World Bank in 1999 and is currently the Program Leader and Lead Economist for Eritrea, Kenya, Rwanda and Uganda. He is based in Nairobi where he coordinates and supports the Bank's team responsible for delivering the analytical and operational portfolio managed by the Equitable Growth, Finance and Institutions group of Global Practices. He previously led the Bank's socio-economic and demographic data team and worked in the Bank's Research Department for some years. Johan studied Biology, Economics and Statistics at the Universities of York (UK) and Maryland (USA).

**Juan Muñoz** is the founder and managing partner of Sistemas Integrales, a firm created in 1970 and based in Santiago, Chile. He is interested in the application of statistics and computer science to economics, health, education and agriculture. As a consultant for universities, governments, international agencies and private clients, he has assisted in the design, implementation, steering and analysis of censuses and agriculture, budget, consumption, demographic, living standard, labor, and opinion surveys in over a hundred countries. These projects usually entail sampling and questionnaire design, survey organization and logistics, integration of computers to fieldwork, quality monitoring, report generation, and database documentation and dissemination.

**Utz Pape** is a senior economist in the Poverty and Equity Global Practice at the World Bank. He leads teams to design and implement lending projects to improve national statistical systems and to prepare analytical poverty work including poverty assessments, poverty impact studies, and Systematic Country Diagnostics. His work experience in post-conflict countries contributes to his research agenda including the design of methodologies for poverty measurement in fragile settings. His research has received awards and is published in peer-reviewed journals, including *Nature*. He holds a Ph.D. from the International Max Planck Research School and the Free University of Berlin and was a postdoctoral associate at Harvard University. He also holds a Master of Public Administration/International Development from the London School of Economics and the School for International and Public Affairs (SIPA) at Columbia University.

**Flavio Russo Riva** is a Ph.D. candidate in Government and Public Administration at the São Paulo School of Administration. His research focuses on impact evaluation of public policies and social programs in Brazil's public education and health systems using observational data and randomized controlled trials. In recent years he has worked as a short-term consultant for the Poverty and Equity Global Practice at the World Bank and for the Inter-American Development Bank.

**Jacob Shapiro** is a Professor of Politics and International Affairs at Princeton University and directs the Empirical Studies of Conflict Project, a multi-university consortium that compiles and analyzes micro-level data and other information on politically motivated violence in countries around the world. He is author of *The Terrorist's Dilemma: Managing Violent Covert Organizations* and co-author of *Small Wars, Big Data: The Information Revolution in Modern Conflict*. His research has been published in a broad range of journals in economics and political science as well as a number of edited volumes. He has conducted field research and large-scale policy evaluations in Afghanistan, Colombia, India, and Pakistan. Shapiro received the 2016 Karl Deutsch Award from the International Studies Association, given to a scholar younger than 40 or within 10 years of earning a Ph.D. who has made the most significant contribution to the study of international relations.

**Dhiraj Sharma** is an economist in the Poverty and Equity Global Practice. His work focuses on welfare measurement, poverty diagnostics, and policy analysis. He has led or contributed to the analysis of poverty in Ghana, Iraq, and Nepal, and has led impact evaluations in Nepal. His current work focuses on welfare analysis and statistical capacity building in countries in the Middle East and North Africa region. His recent work in the region includes research on the impact of refugee influx on host communities. He is a co-author of *Poverty and Shared Prosperity 2018: Piecing Together the Poverty Puzzle*, the World Bank's biennial publication on global extreme poverty. Dhiraj holds a Ph.D. in applied economics from the Ohio State University.

**Andre-Marie Taptué** is an economist in the Poverty and Equity Global Practice at the World Bank. He developed and implemented the Iterative Beneficiary Monitoring (IBM) System. He has also led the Permanent Monitoring System in North Mali and is supporting implementation of the Third-Party Monitoring in Mali. He is currently working on analytical work, a statistical project, policy dialogue, and extending IBM. Prior to joining the Bank, Andre-Marie was a lecturer at Laval University in Canada. He also worked as an economist statistician at the Department of Studies and Statistical Surveys of the National Institute of Statistics in Cameroon. He earned a Ph.D. in economics at Laval University and a master's degree in statistics and economics at ISSEA in Cameroon.

**Tara Vishwanath** is currently a lead economist in the Europe and Central Asia region's Poverty and Equity Practice of the World Bank and Global Lead on Welfare Implications of Climate, Fragility and Conflict Risks. She has led numerous analytical products on poverty, inequality and employment in countries in the South Asia and Middle East and North Africa regions, more recently co-leading the multi-topic survey and analysis of Syrian refugee and host communities in Lebanon, Jordan and Northern Iraq. Before joining the World Bank, she was a Professor in the Department of Economics at Northwestern University and has published widely in refereed economics journals spanning research topics in economic theory, labor economics and development. She holds a Ph.D. in Economics from Cornell University.

**James Walsh** is a member of the World Bank's Behavioral Science Unit, eMBeD, and a doctoral student at the Blavatnik School of Government at the University of Oxford. He was a member of the research team for the World Development Report 2015: Mind, Society, and Behavior and served on the faculty of the Georgetown School of Foreign Service where he lectured in behavioral approaches to development economics. He holds a B.A. in Economics and Political Science from Trinity College Dublin and a Master of Public Policy from the Kennedy School of Government at Harvard University.

**Gervais Chamberlin Yama** is a statistician in the Poverty and Equity Global Practice at the World Bank, with experience in working in fragile and conflict-afflicted states. He has extensive experience in designing, executing, and managing surveys in the Central African Republic, Democratic Republic of Congo, and the Republic of Congo. He has recently developed a new approach to performance-based data collection for enumerators and supervisors in the Central African Republic that enhances data quality and promotes efficiency. He holds a master's degree in statistics from the Sub-Regional Institute of Statistics and Applied Economics (ISSEA) in Yaoundé, Cameroon.


# **1**

# **Fragility and Innovations in Data Collection**

**Johannes Hoogeveen and Utz Pape**

# **1 Introduction**

Fragility, conflict, and violence (FCV) represent a critical development challenge that threatens efforts to end extreme poverty and promote shared prosperity. Two billion people live in countries where development outcomes are affected by FCV, including many countries in Africa. Of the 38 countries on the World Bank's official 2018 FCV list, 20 can be found in Africa. Moreover, while the global share of the extreme poor living in conflict-affected situations is about 20%, this number is much higher in Africa, around 32%. In fact, nearly 80% of all poor people living in conflict-affected situations reside in Africa (Fig. 1).

J. Hoogeveen (\*) · U. Pape
World Bank, Washington, DC, USA
e-mail: jhoogeveen@worldbank.org

U. Pape
e-mail: upape@worldbank.org

© International Bank for Reconstruction and Development/The World Bank 2020. J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_1

**Fig. 1** Extreme poverty (2017 or latest available number) (*Source* World Bank, Poverty and Equity Data Portal, accessed November 2017)

Particularly worrisome is that between now and 2030, the share of extremely poor people living in FCV countries is expected to rise from 20 to 50%. Given that most of these people are likely to be in Africa, it is unsurprising that at the 2015 Annual Bank Conference on Africa, Makhtar Diop, then the World Bank's vice president for the region, emphasized not only the importance of fragility, but also the need for a much more profound inquiry into its drivers and consequences: "Conflict and fragility exact a costly toll on the economies of Africa. As we scale up our operational work in fragile states, a better understanding of the causes and impacts of conflict and fragility can help to prevent some of the deadly conflicts at the community level."

Understanding the socio-economic well-being of citizens in such countries, and measuring the impacts of shocks and conflicts, starts with better data. Data deprivation is a pressing problem in FCV settings for both decision makers and citizens, and in particular for the poor, who often lack voice and agency, and who may remain invisible unless data identify their existence and state of being. The need for reliable data on living conditions in fragile situations is even greater, and yet data deprivation tends to be worse in such contexts. Data can provide evidence on the plight of some of the most vulnerable populations, such as the displaced, or those affected by natural disasters, violence, famine, or epidemics, and can facilitate the formulation of policy responses by decision makers. As such, there is an urgent need for data in fragile situations.

This book attempts to address this data challenge. It reflects work carried out by World Bank staff from the Poverty and Equity Global Practice and by others, covering our experiences in fragile situations and the challenges we faced around data collection, mostly in Africa (the Central African Republic, the Democratic Republic of Congo, Liberia, Madagascar, Mali, Malawi, Nigeria, Senegal, Sierra Leone, Somalia, and South Sudan) but also in Iraq, Jordan, Lebanon, and Yemen.<sup>1</sup> Typical welfare surveys such as the Living Standard Measurement Surveys (LSMSs) and Household Budget Surveys (HBSs) that are implemented in a large number of countries are not always appropriate for these situations. Because of the pressing demand for data, there has been significant support for experimentation and innovation around data collection methods. This has allowed us to develop solutions suitable for these contexts, which are often equally relevant for non-fragile settings.

Through our experiences in identifying innovative ways to collect data, we have learned three lessons. First, it is possible to collect high-quality data in fragile settings. Doing so may require adaptations to the data collection process but situations in which no information can be collected are rare. Second, data collection in fragile contexts does not need to be more expensive than in other settings. In fact, the costs associated with many of the innovations discussed in this book compare favorably to more traditional data collection methods. Third, a careful assessment of the data needs of decision makers is essential. Often relatively easy-to-collect information goes a long way toward meeting their demands, as long as it is provided in a timely fashion. This holds particularly in volatile situations. Hence it may be sufficient to demonstrate whether respondents can engage in certain income-generating activities, without measuring how much income is actually earned. Perception questions, eliciting information about trust, security, or development priorities, tend to be very informative for decision makers in unstable settings where rumors spread quickly and where opinion polls and (objective) media reporting are absent. In other instances, simple to collect information does not suffice. We present such a case in Chapter 9 for Somalia where estimates of poverty had to be produced even though interviews could not be lengthy for security reasons, precluding asking detailed consumption questions.

<sup>1</sup>Not all these countries are on the fragile country list maintained by the World Bank and downloadable from: http://www.worldbank.org/en/topic/fragilityconflictviolence/brief/harmonized-list-of-fragile-situations. When countries are not on the fragile country list, the discussed approaches were typically applied during an emergency, as was the case during the Ebola crisis in Sierra Leone and the 2016 floods in Malawi.

There was also a fourth lesson: technology is not a panacea for all data collection issues and not everything works. We considered machine learning and big data, but these approaches were not successful. Cloud computing and improvement of statistical learning algorithms enable the use of satellite images and other sources of big data, but satellite images can be expensive, the methodologies can be complex, and external validity is at times difficult to ensure. Some data collection exercises were discontinued because of a lack of funding (and by implication, a lack of demand). Tablets facilitated electronic data collection and reduced field supervision, but in some situations their use complicated data collection as it raised suspicion from respondents or unwanted attention from thieves. Improved mobile phone coverage also created the opportunity to use mobile phone interviews for data collection in insecure areas, but the resulting information may not be representative of the population.

It has been immensely rewarding to find ways to produce reliable data in the face of significant challenges: absent sampling frames, high levels of insecurity, and limited budgets. We feel privileged to have been given the opportunity to collect data that has helped inform decision makers at critical junctures of the development process. However, we also realize that our work is far from complete. With adaptations, many of the innovations presented in this book are scalable. This holds for district censuses, which are highly suited to inform decentralization processes, or Iterative Beneficiary Monitoring (IBM), which can be used to improve project performance in any context. Rapid consumption surveys have the potential to significantly reduce the cost of collecting consumption data, and sampling frames derived from satellite images can be used more systematically to update sampling frames. Moreover, with cell phone coverage continuously improving, mobile phone surveys that can be scaled up rapidly during a crisis deserve to become part of the regular tool-box of disaster planning, as they can offer timely data when a crisis is imminent. Examples presented in this book include monitoring the Ebola crisis and people displaced by the crisis in Mali, and informing a famine response in Nigeria, Somalia, South Sudan, and Yemen.

## **2 Data Collection in Fragile Situations**

Fragility, conflicts and violence affect data collection in multiple ways. The capacity to implement and analyze complex surveys tends to be limited, and resources to pay for data collection are scarce because the revenue generating capacity in FCV settings tends to be constrained and because funding for data collection competes with other urgent needs. For these reasons, few household surveys are implemented in fragile situations or, if they are, they are implemented irregularly or without covering the entire territory. In addition, risks in FCV countries are oftentimes elevated, because of violence but also because of other dangers, such as disease. In Somalia, for instance, a traditional household consumption survey with interview lengths exceeding several hours was not possible given the level of insecurity and the danger posed to enumerators who spent more than one hour with a household. During the Ebola crisis, enumerators could not travel and collect information from respondents using face-to-face interviews because of the risk of infection.

Data collection during conflict is also affected by poor road quality, inadequate telecommunications infrastructure and, at times, populations that are hostile to representatives of a central government that offers little in terms of key public services. These challenges arise because conflicts tend to occur in locations that are physically distant from administrative centers, are isolated, have low population density and few key public services, and bear the brunt of weak state capacity. Collecting data in such situations is logistically challenging; moreover, people living in these areas often feel little loyalty to the distant capitals that have historically ignored them and may be hostile to anyone seen to represent the state.

Mobile target populations are a further complication often associated with data collection in fragile situations. Mobility is a challenge not only because pastoralists tend to live in distant, low-density areas that are often the theaters of conflict, but also because displacement is a major issue during times of insecurity. During the crisis in northern Mali, for example, 36% of the population fled the area, and in the Central African Republic, 25% of the population was displaced. The United Nations High Commissioner for Refugees (UNHCR) estimated that by the end of 2016, there were 5.1 million refugees in Africa, with the Central African Republic, the DRC, Somalia, South Sudan, and Sudan being the major sources of refugees. The number of internally displaced people (IDPs) is even higher, with almost 9 million displaced people in these five countries alone.

Data collection in FCV settings is also affected by the absence of adequate sampling frames, which may have been lost or are simply out of date. In the case of the Central African Republic, for instance, during the civil war, much of the data infrastructure (buildings, books, maps, servers, and computers) was lost to looting. However, even without the looting, sampling frames would no longer have been valid as a large proportion of the population had become displaced. Finally, there is often time pressure, as decision makers require accurate information with a quick turnaround. In the Central African Republic, following the signing of the Peace Accord, the team had 90 days to prepare, field, and analyze a survey to yield representative data on the development priorities of citizens. The pressure to inform decision makers during or directly after a disaster can be even higher, for example, in the Ebola-affected countries, or for the drought response in Nigeria, Somalia, South Sudan, and Yemen.

Because traditional data collection methods are not always suited to fragile situations, this book presents innovations developed to deal with some of these challenges. Some, though not all, were also motivated by the fact that data needs in fragile situations are different. There is much more emphasis on timely data that can monitor a given situation than on in-depth analyses to inform policy decisions. For example, policymakers in insecure settings often prefer knowing where schools are and whether they are still functioning, rather than seeing a detailed analysis of whether the rate of return to education is higher at the primary or tertiary level. This reality has shaped some of the data collection processes presented in this book, as questionnaires in these contexts can be less comprehensive. Shorter questionnaires, in turn, combine well with mobile phone interviews, which typically should not last longer than 20–30 minutes, and with interviews by locally resident enumerators, who cannot be retrained for every new questionnaire. District surveys introduced in the Central African Republic and Mali capitalized on the realization that an index reflecting the degree of public service provision (health, water, education, and infrastructure) at the lowest administrative level was a pragmatic alternative to a more detailed poverty map, which would take a long time to create. The IBM approach introduced in Mali, which offers feedback to project staff drawn from light data collection exercises, was developed to complement project supervision missions, which had become difficult to conduct due to security concerns. The approach relies on highly simplified data collection tools, which ensure focus and speed and keep costs down.

Simplifications are not always possible. In Somalia, for instance, up-to-date poverty estimates were needed to inform the Heavily Indebted Poor Countries (HIPC) process. Under normal circumstances, estimating poverty requires administering a lengthy consumption module that takes several hours to complete. However, due to security concerns, it was advised that the maximum duration of a household interview should not exceed 60 minutes. This time restriction meant that a lengthy consumption module was not possible, even if questions about education, health, and perceptions were dropped. Using a new questionnaire design with smart sampling techniques at the level of questions solved this challenge.

To structure the book, we organized it into three parts. Part I: "Innovations in Data Collection" presents ways to collect data that are cognizant of security and other risks, as well as the specific data needs of decision makers in FCV countries. The first three chapters in this part discuss data collection using mobile phone interviews. Chapter 2 provides an example of this method during the Ebola crisis in Sierra Leone. Chapter 3 describes how mobile phone interviews were used to inform a response to the drought in Nigeria, Somalia, South Sudan, and Yemen. Chapter 4 reports an exercise to track people displaced by the crisis in northern Mali. Chapter 5 discusses how, in situations where travel by outsiders is too dangerous, data collection may still be feasible by relying on locally recruited, resident enumerators who are trusted by their community. Chapter 6 discusses the district survey and Local Development Index introduced in the Central African Republic. The survey informed the Recovery and Peace Building Assessment and collects much of the data that feeds into the national monitoring system.

Part II: "Methodological Innovations" presents innovations with respect to collecting data and sampling. To deal with the absence of sampling frames in the DRC and Somalia, satellite images and sophisticated machine learning algorithms were used to estimate population density and demarcate enumeration areas (Chapter 7). The same chapter also showcases a novel sampling approach implemented in the Afar region of Ethiopia to ensure that pastoralists were adequately included. This approach was also used in Somalia to avoid listing exercises that were viewed with suspicion by communities and authorities. Chapter 8 discusses sampling for representative surveys of displaced populations, using the example of Syrian refugees and host communities in Jordan, Lebanon and Kurdistan, Iraq. Chapter 9 offers a solution for those interested in collecting poverty estimates for insecure locations in which the time available for face-to-face interviews is too limited to implement the lengthy household consumption expenditure surveys that are generally used for measuring poverty. Chapters 10 and 11 discuss how to elicit truthful information from respondents. Chapter 10 focuses on asking questions about sensitive issues, such as loyalty to controversial groups, while Chapter 11 deals with how to avoid strategic responses when respondents might expect benefits to be associated with certain answers.

Part III: "Other Innovations" presents a project that used video testimonials (Chapter 12) as a unique and cost-effective way to give external audiences a perspective on the lives of survey respondents. In South Sudan, a web portal was created where one can watch short video testimonials of respondents describing their situation in their own words, which not only provided the necessary context for the quantitative results, but also gave a voice to the poor. In Chapter 13, IBM is discussed, which relies on light-touch, repeated data collection exercises to create dynamic feedback loops for project staff. IBM has been found to enhance the efficiency of projects and is, because of its minimalist data demands, highly suited for fragile contexts.

**Table 1** Topical guide to this book

We have aimed to keep this book practical and accessible, focusing on illustrations and applications, as our objective is to provide the reader with examples of what is feasible. Every chapter presents the data challenge, how it was addressed, and lessons learned. For readers interested in specific topics, we present in Table 1 an overview of which chapters might be of interest. For example, if the concern is that respondents might give biased answers, because questions touch upon sensitive issues or because the respondent may believe that the right responses can result in certain benefits, then Chapters 10 and 11, which discuss methodological solutions and behavioral nudges, respectively, would be worth reading.

#### **Box 1 Using tablets for data collection allows for a rich array of innovations**

Using tablets or mobile phones to collect data, or more specifically, Computer-Assisted Personal Interviews (CAPI), has led to more changes than making data entry obsolete. Enumerator error can be reduced with dynamic validity checks and complex skipping patterns that are not possible in paper questionnaires, opening up new possibilities. The randomization of questions can now be automated, for instance, a feature that has been part of rapid consumption surveys (Chapter 9) and list experiments (Chapter 10).

To improve accuracy, CAPI can identify implausible responses and ask enumerators to verify or correct their entries before proceeding. This has proved useful in consumption modules, where responses can be assessed against caloric needs, or where unit values can be checked against plausible price ranges. Photos can also be used to obtain more reliable estimates of otherwise hard-to-quantify and seasonally variable units such as a "heap" or "bunch."

The use of tablets also improves supervision. GPS locations can be collected in the background, allowing supervisors to ensure that enumerators are where they are expected to be, and also to assess the spatial distribution of a sample. Tablets can monitor the time it takes to record answers, and interview snippets can be recorded randomly. These features can quickly confirm whether interviews are actually conducted, reducing the need for unannounced supervision visits.

Enumerators can also take advantage of the additional hardware included in tablets. For panel surveys, households can be given a barcode, which can be photographed or scanned with a tablet, thus reducing the frequency of mistakes. The ability to take pictures and shoot video can be used to enrich feedback in other ways as well. Chapter 12 presents an instance where enumerators were trained to use their tablets to record, after the formal interview, stories about the experiences of interviewees.

Where mobile phone networks are available, tablets can send data for aggregation and real-time analysis, significantly reducing the time it takes to produce results. As data is typically sent into the "cloud," such analysis can be done anywhere across the globe. The Rapid Emergency Response Survey presented in Chapter 3 made use of this feature. When enumerators are in the field for a long time, or when questionnaires need to be updated because errors need to be corrected, the use of tablets allows for remote questionnaire management, a feature used in Chapter 5 to provide resident enumerators with new survey instruments and questions.
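The plausibility checks described in Box 1 can be illustrated with a minimal Python sketch. It is not the logic of any particular CAPI package: the function name `check_unit_value`, the item codes, and the price ranges are all hypothetical and chosen only for illustration. The sketch computes a unit value from a reported quantity and amount paid and returns a prompt for the enumerator when the value falls outside an assumed plausible range.

```python
# Hypothetical plausible price ranges per unit (local currency).
# These values are illustrative assumptions, not survey parameters.
PRICE_RANGES = {"rice_kg": (0.5, 3.0), "cassava_heap": (0.2, 1.5)}


def check_unit_value(item, quantity, amount_paid, price_ranges=PRICE_RANGES):
    """Return None if the response is plausible, else a prompt for the enumerator."""
    if quantity <= 0:
        return f"{item}: quantity must be positive, please re-enter"
    low, high = price_ranges[item]
    unit_value = amount_paid / quantity  # price implied by the response
    if unit_value < low or unit_value > high:
        return (f"{item}: unit value {unit_value:.2f} is outside the plausible "
                f"range {low}-{high}, please verify or correct")
    return None


# A response inside the range passes silently; one far outside it is flagged.
print(check_unit_value("rice_kg", 2, 3.0))
print(check_unit_value("rice_kg", 2, 30.0))
```

In a real instrument the check would run before the enumerator can move to the next question, and the ranges would come from market price data rather than from constants in the code.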

The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Part I**

# **Innovations in Data Collection**

# **2**

# **Monitoring the Ebola Crisis Using Mobile Phone Surveys**

**Alvin Etang and Kristen Himelein**

## **1 The Data Demand Challenge**

The outbreak of the Ebola virus disease in West Africa in 2014 constituted one of the gravest global health emergencies of recent years.<sup>1</sup> The Ebola outbreak originated in rural Guinea in December 2013, and then spread across the country and to the neighboring countries of Liberia and Sierra Leone. The epidemic continued for two years: the World Health Organization (WHO) declared Liberia free of Ebola only in May 2015, Sierra Leone in November 2015, and Guinea in December 2015. By the end of the crisis, the epidemic had claimed more than 11,300 lives in these three countries, including over 500 frontline healthcare workers.<sup>2</sup>

A. Etang (\*) · K. Himelein

World Bank, Washington, DC, USA

e-mail: aetangndip@worldbank.org

K. Himelein e-mail: khimelein@worldbank.org

<sup>1</sup>Henceforth, the term Ebola is used to refer to the virus, the disease, or the epidemic outbreak.

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_2

**Fig. 1** Timing of Sierra Leone and Liberia high-frequency mobile phone surveys (Color figure online) (*Note* Shading reflects dates of data collection. *Source* Authors' calculations based on WHO Sit Rep data)

In addition to its effects on people's health, Ebola caused widespread economic disruption. At the height of the epidemic, schools and markets were closed, government workers were placed on furlough, social gatherings were banned, transportation restrictions were placed on people and goods, and international borders were closed. Therefore, in addition to the health monitoring by the WHO, there was an urgent need for just-in-time data in order to monitor the economic impact of Ebola on livelihoods and wellbeing. Given the epidemic, however, it was impossible to deploy enumerators to the field to collect information from households and communities through face-to-face interviews.

<sup>2</sup>World Bank (2016).

The solution to this challenge came from the realization that the rapid spread of mobile phone coverage had created possibilities to monitor the crisis through mobile phone interviews. Mobile phones are particularly useful in situations in which data must be collected rapidly, at low cost, and/or in situations where traditional face-to-face interviews are not possible. In Sierra Leone and neighboring Liberia, this approach allowed for a timely response by providing critical data to decision makers about household welfare at the height of the crisis and during its aftermath (Fig. 1).

## **2 The Innovation**

The proliferation of mobile phone networks and inexpensive handsets has opened up new possibilities for data collection. Since 2012, the Africa region of the World Bank has supported a mobile phone survey initiative called Listening to Africa (L2A). L2A collaborates with statistical agencies and offers the possibility to complement face-to-face household surveys with mobile data collection.<sup>3</sup>

The standard L2A approach starts with a face-to-face household survey that serves as a baseline. This baseline survey ensures that the randomly drawn sample is representative of the target population. During this survey, each respondent receives a simple mobile phone and, when necessary, a solar charger. A call center then calls the respondents every month to conduct the mobile phone interviews. Survey questions are programmed in computer-assisted telephone interview software, allowing questions to be posed and answers to be recorded simultaneously. The phone interviews are short so that data can be collected quickly and respondents do not become overly fatigued. Data, once collected, are made available to the public.

<sup>3</sup>More information on this approach, including the instruments used, can be found on the L2A website: http://www.worldbank.org/en/programs/listening-to-africa. See also: Johannes Hoogeveen et al. (2014).

The L2A approach has been introduced in several countries, including Madagascar, Malawi, Mali, Senegal, Tanzania, and Togo, and the L2A team has prepared a handbook documenting its experiences.<sup>4,5</sup> A two-minute video explaining the L2A approach can be found on the World Bank's website.<sup>6</sup>

**Fig. 2** Responses on food security issues from various L2A surveys (*Source* Authors' calculations from the Malawi, Madagascar, and Senegal L2A surveys)

While typical L2A questionnaires are fixed ahead of time, the instrument is flexible and can adapt to unforeseen needs. In particular, the high-frequency collection was well-suited to monitor food security, and the L2A team was able to respond to the unfolding situations in Malawi, Senegal, and Madagascar (Fig. 2). A sample questionnaire with food security questions that can be used for mobile phone interviews is presented in the annex to this chapter.

When the Ebola crisis began in 2014, the World Bank team had accumulated several years of experience with mobile phone surveys. Building on the L2A model, high-frequency mobile phone interventions were designed to provide rapid monitoring of the socio-economic impacts of Ebola in Liberia and Sierra Leone.<sup>7</sup> As the L2A approach had shown, baseline information was needed to anchor estimates in a representative dataset. Fortunately, there were recent surveys in both countries that could serve this purpose. In Liberia, the Household Income and Expenditure Survey (HIES) was being conducted as the crisis broke out, and was forced to curtail its fieldwork in August 2014. Though only about half of the sample (4075 households) was surveyed, it was nationally representative and, despite not being planned as a panel survey, had collected phone numbers and contact information for respondents. Overall, 57% of HIES households reported a mobile phone number for at least one household member. This database of phone numbers and household characteristics became the sample frame for the mobile phone survey. In total, five rounds of phone interviews were completed between October 2014 and March 2015. Data were collected by the Gallup Organization from their US-based call centers, as there was no suitably experienced call center on the ground in Liberia, and it was not possible to bring in international experts due to the travel ban. While using an external call center posed several challenges, including a lack of proficiency in local languages, unwillingness of respondents to speak to strangers, and the high cost of calling, the survey was able to conduct 2781 interviews with 1082 unique households over the five rounds.

<sup>4</sup>Dabalen et al. (2016).

<sup>5</sup>Available at https://openknowledge.worldbank.org/bitstream/handle/10986/24595/9781464809040.pdf.

<sup>6</sup>http://www.worldbank.org/en/news/video/2017/01/23/listening-to-africa-a-new-way-to-gather-datausing-mobile-phones.

In Sierra Leone, the 2014 Labour Force Survey (LFS) was also being carried out during the Ebola crisis, with fieldwork completed in July 2014. The LFS is a nationally representative survey, with a sample size of 4188 households. It was planned as a panel survey, and had therefore collected phone numbers and contact information, with 66% of LFS households reporting a mobile phone number for at least one household member. Using this database, three rounds of data collection were completed between November 2014 and May 2015. Data were collected through a call center at the national statistics bureau, Statistics Sierra Leone, supervised by Innovations for Poverty Action for the first two rounds and supervised directly by the World Bank for round three. The survey was able to reach 2111 respondents over the three rounds (Himelein et al. 2015).

<sup>7</sup>A mobile phone survey was also conducted in Guinea but using a different methodology (World Bank 2016).

## **3 Results from the Ebola Surveys**

The Ebola surveys covered a wide range of topics: employment, agriculture, food security and prices, social assistance, remittances, migration, education, and health facility utilization. The team deliberately avoided asking questions directly related to illness within the household. Such questions were omitted for two reasons: first, to prevent non-response if households feared the authorities would come to remove ill members and, second, because the nature of the national sample was not well-suited to surveying disease incidence. The survey included topics that were kept consistent in every round for monitoring purposes, such as those related to food security and economic activity, and some that were included in only one or two rounds based on the evolving situation. For example, the first round included questions as to whether the respondent had ever heard of Ebola and what sources of information they had on prevention. In later rounds, questions on education were added as schools reopened, and questions on social assistance as safety net projects were rolled out.

The results from the survey yielded several important findings related to the economic situation. In both Sierra Leone and Liberia, the surveys found significant declines in employment during the crisis, but the effects were not significantly higher in places with higher numbers of Ebola cases. This indicates that an overall economic slowdown caused by the nationwide precautionary measures, particularly the closure of markets, had more of an impact on employment than direct cases of Ebola. Moreover, in both countries, women were more likely to have stopped working during the crisis, and less likely to have returned to work by the end of the data collection period. In Sierra Leone, income and labor force participation (hours) for both men and women remained below baseline levels at the end of data collection, although the overall percentage of individuals working had largely rebounded. In addition, many workers had switched sectors during the crisis, generally moving to positions with lower productivity (Fig. 3).

**Fig. 3** Evidence from the Sierra Leone and Liberia phone surveys (*Source* Sierra Leone high-frequency mobile phone survey and Liberia high-frequency mobile phone survey)

Beyond the findings related to labor markets, the surveys provided important insights related to prices, food security, coping strategies, education, avoidance of healthcare facilities, and perceptions of public safety and trust in institutions. The surveys were able to monitor the usage of healthcare facilities for non-Ebola medical care. For example, among women in Sierra Leone who had given birth in the previous two months, the percentage who did so in a hospital or clinic increased from 28% in November 2014 to 64% in February 2015 and to 89% in May 2015. In some cases, these findings conflicted with the anecdotal evidence that had previously been guiding policy. In agriculture, farmers in both countries estimated that production had declined, but to a lesser extent than had been feared, and with no evidence of the widescale abandonment that had previously been reported. In Sierra Leone, a delay in the arrival of seasonal rains also played a role. In education, once schools reopened, most students returned: 87% in Sierra Leone and 73% in Liberia. Among those who did not, the reason cited was monetary rather than fear of infection.

## **4 Implementation Challenges, Lessons Learned, and Next Steps**

Although they cannot replace face-to-face household surveys in all contexts, mobile phone surveys offer substantial benefits in specific circumstances and for specific data collection needs. Advantages include the ability to collect data in volatile and high-risk environments (such as during political crises or epidemics), flexibility and responsiveness to new data needs, timeliness, cost effectiveness, and utility for monitoring and impact evaluation. However, this approach remains challenging, and several lessons have been learned.

The risk of non-response and attrition applies to all panel surveys but is more likely for high-frequency mobile phone panel surveys. In the case of L2A, several strategies were undertaken to minimize these risks. Because sample selection did not consider prior ownership of a mobile phone, some households, particularly the poorest ones, had access to a mobile phone network but did not actually own mobile phones. To overcome this, mobile phones were distributed to all selected households, regardless of whether they already owned one, and respondents received training on various aspects of mobile phone ownership. In addition, the frequent power cuts in survey locations meant that phones could not be recharged, which could then lead to nonparticipation. To address these power cuts, small solar chargers were provided to allow households to charge their phones and receive follow-up calls.

In L2A, respondents were compensated each time they completed a phone interview, receiving a small amount of airtime credit transferred directly to their phones. This was both to compensate respondents for their participation, thereby encouraging them to stay involved, and to prevent the cancellation of phone numbers, which is a risk for those who do not 'top up' their phones after a certain period (usually 90 days). The lag period between the baseline survey and the first phone interview was also kept short. During the baseline survey, phone numbers were collected for all household members to increase the chances of reaching the respondent, and respondents were asked for their preferred call times. Efforts to track and trace hard-to-reach respondents also continued throughout implementation.

Response rates for the L2A surveys were generally high, reflecting the numerous measures taken to minimize non-response. In the Ebola surveys, however, other than providing limited compensation to respondents, it was not possible to apply any of the above mitigation strategies. This was compounded by low network coverage rates, particularly in rural areas, and led to low response rates and issues with sample representativeness. For those baseline survey households that did not respond in some or all of the cell phone rounds, analysts attempted to mitigate the impact of attrition by adjusting the weighting of the data. The correct weighting depends on whether cross-sectional or panel analysis is being conducted and, in the case of panel analysis, on which rounds of the survey are being compared. In the Sierra Leone and Liberia mobile phone surveys, multiple sets of weights were necessary depending on the combination of rounds. While the distribution of respondents in the mobile phone survey by age, gender, county, and sector of employment was similar to that found in the HIES and LFS samples, response rates were far lower in rural areas than in urban areas due to limited network coverage. To adjust for differences in characteristics between the baseline and subsequent rounds, it was necessary to apply an attrition adjustment to the baseline survey weights. The adjustment included a propensity score adjustment, which uses the available characteristics of the household head from the baseline survey (age, gender, location, and sector of employment), and a post-stratification adjustment, which increased the total weighting of each stratum to match the distribution found in the last census. Full details of the weighting methodology can be found in World Bank (2014), and each report contains a table showing the regression results underlying the propensity score calculations on which the weighting adjustments were based. Even after taking into account these adjustments, however, careful review is necessary to determine whether the results from the mobile phone survey can truly be considered representative, as opposed to merely indicative (Fig. 4).
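The two-step logic of the attrition adjustment can be sketched in Python. This is a simplified illustration, not the procedure documented in World Bank (2014): the regression-based propensity score is replaced here by the weighted response rate within cells of baseline characteristics, and the function name, field names, and numbers are hypothetical.

```python
from collections import defaultdict


def attrition_adjusted_weights(households, census_totals):
    """Adjust baseline weights for attrition, then post-stratify.

    households: list of dicts with keys 'base_weight', 'cell', 'stratum',
    'responded' (hypothetical field names). census_totals maps each
    stratum to its population total from the last census.
    """
    # Step 1: estimate the response propensity in each cell as the
    # weighted share of baseline households that responded
    # (a stand-in for a regression-based propensity score).
    total = defaultdict(float)
    responded = defaultdict(float)
    for h in households:
        total[h["cell"]] += h["base_weight"]
        if h["responded"]:
            responded[h["cell"]] += h["base_weight"]

    # Step 2: keep respondents only, inflating each baseline weight by
    # the inverse of the estimated propensity for its cell.
    adjusted = []
    for h in households:
        if h["responded"]:
            propensity = responded[h["cell"]] / total[h["cell"]]
            adjusted.append({**h, "weight": h["base_weight"] / propensity})

    # Step 3: post-stratify, rescaling weights so that each stratum's
    # weighted total matches the census total.
    stratum_sum = defaultdict(float)
    for h in adjusted:
        stratum_sum[h["stratum"]] += h["weight"]
    for h in adjusted:
        h["weight"] *= census_totals[h["stratum"]] / stratum_sum[h["stratum"]]
    return adjusted


# Illustrative data: response is much lower in the rural cell.
sample = (
    [{"base_weight": 100.0, "cell": "urban", "stratum": "urban", "responded": r}
     for r in (True, True, True, False)]
    + [{"base_weight": 200.0, "cell": "rural", "stratum": "rural", "responded": r}
       for r in (True, False, False, False)]
)
for h in attrition_adjusted_weights(sample, {"urban": 500.0, "rural": 1000.0}):
    print(h["stratum"], round(h["weight"], 2))
```

The single rural respondent ends up carrying the entire rural stratum total, which illustrates why low rural response rates call for a careful review of representativeness even after reweighting.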

Another lesson learned was to keep the survey short. While households can and will participate in a mobile phone interview, the questionnaire must be kept short to minimize respondent fatigue, which can be a cause of attrition and non-response. Mobile phone-based surveys are therefore not appropriate for lengthy interviews or complex questions, such as those relating to household consumption. Mobile phone surveys also cannot substitute for the in-depth information that can be collected in face-to-face household surveys.

**Fig. 4** Response rates for the high-frequency mobile phone surveys in Sierra Leone and Liberia (*Source* Authors' calculations)

While fielding new ad hoc surveys to monitor an evolving crisis is possible (see Chapter 3), a more systematic approach is clearly preferable. If a representative mobile phone survey could be carried out on short notice, this would not only provide valuable real-time information, but could also be used to mount an effective response. The high-frequency mobile phone surveys to monitor the socio-economic impacts of Ebola in Sierra Leone and Liberia were possible because the most recent national household surveys had collected contact information. A proactive approach to crisis monitoring would start with the systematic creation (and maintenance) of databases with phone numbers and core household respondent characteristics. Another lesson from the Ebola crisis is that setting up a call center is relatively straightforward and can even be done from abroad.

## **Annex 1: Links to Ebola Reports**

Four reports were produced using the five rounds of the High-Frequency Cell Phone Survey on the Socio-Economic Impacts of Ebola in Liberia:

The socio-economic impacts of Ebola in Liberia: results from a high frequency cell phone survey (rounds one and two), released in November 2014: http://documents.worldbank.org/curated/en/2014/11/24048037/socio-economic-impacts-ebola-liberia-results-high-frequency-cell-phone-survey.


Three reports were produced using the three rounds of the High-Frequency Cell Phone Survey on the Socio-Economic Impacts of Ebola in Sierra Leone:


## **Annex 2: Listening to Africa, Nutrition and Food Security Questionnaire**

Today, we would like to ask you about food consumption in your household.

## **Nutrition**



# **Food Security**

B1. In the past 7 days, did you worry that your household would not have enough food? Answer:

1=Yes 2=No



B4. In the past "X" months [number of months since the last survey on this topic], have you been faced with a situation when you did not have enough food to feed the household? Answer: \_\_\_\_\_\_\_

1=Yes 2=No≫B7


B5. When did you experience this incident in the last "X" months [number of months since the last survey on this topic]?

MARK X IN EACH MONTH OF 2016 WHEN THE HOUSEHOLD DID NOT HAVE ENOUGH FOOD.

LEAVE CELL BLANK FOR FUTURE MONTHS FROM INTERVIEW DATE OR MONTHS MORE THAN "X" MONTHS AGO FROM INTERVIEW DATE [number of months since the last survey on this topic].


B6. What was the cause of this situation? LIST UP TO 3 [Do not read options. Code from response].


Codes for B6:




B8. In case of food shortage, who eats less? Answer: \_\_\_\_\_\_\_




The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **3**

# **Rapid Emergency Response Survey**

**Utz Pape**

## **1 The Data Demand and Challenge**

In 2017, the United Nations (UN) stated that the world was facing the most serious humanitarian crisis since the Second World War, with over 20 million people at risk of starvation and famine.<sup>1</sup> The crisis was concentrated in four countries: Nigeria, Somalia, South Sudan, and Yemen. Alongside hunger, large portions of the population in these countries were facing deteriorating living conditions and threatened livelihoods.<sup>2</sup>

U. Pape (\*)

World Bank, Washington, DC, USA e-mail: upape@worldbank.org

<sup>1</sup>https://www.theguardian.com/world/2017/mar/11/world-faces-worst-humanitarian-crisis-since-1945-says-un-official.

<sup>2</sup>Food Security Outlook Update Nigeria, Famine Early Warning Systems Network (2017); Post-Gu Technical Release Somalia, Food Security and Nutrition Analysis Unit and Famine Early Warning Systems Network (2017); Food Security Outlook Update South Sudan, Famine Early Warning Systems Network (2017); Food Security Outlook Update Yemen, Famine Early Warning Systems Network (2017).

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_3

The crisis was driven by both drought and conflict to differing degrees in the four countries. In Nigeria, the Boko Haram conflict contributed to poor market access, severe food shortages, and disruption of livelihoods in the North-Eastern States.<sup>3</sup> For the Somali population, the dry agricultural season contributed to high food prices, livestock losses, and displacement.<sup>4</sup> In South Sudan, below-average crop production and inter-communal violence contributed to famine in the former Unity State; in addition, 70% of the population of South Sudan was in serious need of humanitarian assistance.<sup>5</sup> In Yemen, airstrikes and violent clashes on the ground kept food prices high and resulted in high dependency on food imports and emergency aid.<sup>6</sup>

The crisis required a response along the humanitarian-development nexus, to address urgent humanitarian needs while working toward short- to medium-term socio-economic development goals. The UN and the World Bank have worked to synchronize their responses to crises to the greatest extent possible.<sup>7</sup> Greater development can improve resilience and reduce fragility, so that future shocks do not automatically lead to humanitarian catastrophes.<sup>8</sup>

During the crisis, rapid data collection was required to assess the population at risk of famine. Traditional survey methods were unsuitable for a variety of reasons. First, results were needed urgently, so lengthy household questionnaires were inappropriate. Second, funding constraints meant that costly traditional surveys were also unfeasible. Third, a significant portion of the affected populations was believed to be located in conflict-affected areas, where face-to-face data collection is very risky. Given the context, there was a need for a survey that was low-cost, fast, and technically feasible. Data collection needed to be launched and completed in a matter of days, while also ensuring the safety of the implementing teams. Convincing sampling frames had to be obtained in environments where existing data was scarce. Finally, the crisis was unfolding in four different contexts, and country-specific approaches were required that were both standardized and adaptive.

<sup>3</sup>Food Security Outlook Update Nigeria, Famine Early Warning Systems Network (2017).

<sup>4</sup>Post-Gu Technical Release Somalia, Food Security and Nutrition Analysis Unit and Famine Early Warning Systems Network (2017).

<sup>5</sup>The UN officially declared famine in parts of Unity State, South Sudan: https://unmiss.unmissions.org/famine-declared-parts-south-sudan; Key IPC Findings: January–July 2017, Integrated Food Security Phase Classification (2017).

<sup>6</sup>Food Security Outlook Update Yemen, Famine Early Warning Systems Network (2017).

<sup>7</sup>Making the Links Work: How the humanitarian and development community can help ensure no one is left behind, Inter-Agency Standing Committee (2014).

<sup>8</sup>New Way of Working, United Nations Office for the Co-ordination of Humanitarian Affairs (2017).

## **2 The Innovation**

The Rapid Emergency Response Survey (RERS) was designed with standardized survey protocols that can be implemented quickly in times of crises. It was designed as a phone survey to allow rapid access to populations at risk of famine, and can be carried out by local call centers at low cost.<sup>9</sup> During the crisis, enumerators recorded data digitally and uploaded it every day to a cloud-based server, in order to map and update data trends on a daily basis.

The questionnaire was quick to administer, yet still included a broad range of development topics that might need to be better understood during a crisis. A maximum administering time of about 20 minutes was necessary for several reasons: phone networks often have weak connectivity, making long interviews difficult; respondents have shorter attention spans over the phone than in face-to-face interviews; and minimizing respondent fatigue was crucial to increasing the accuracy of the data and to avoiding a burden on potentially stressed respondents. However, the questionnaire also had to provide a wide snapshot of the population's conditions, investigating a comprehensive set of topics including education, livelihoods, health, market access, food security, and water, in order to identify which had been most affected by the crisis and to inform a response.

<sup>9</sup>The call centers were located in-country for Nigeria, South Sudan, and Somalia, and in Egypt for Yemen.

The survey covered mobile phone users, with a focus on areas deemed to be in 'emergency' or worse by the Integrated Food Security Phase Classification (IPC).<sup>10</sup> In order to participate in the survey, households had to own a mobile phone, have network coverage, and have a means to charge the phone. As such, one key limitation of mobile surveys is that they exclude households too poor to own a mobile phone and households in areas too isolated to have coverage. Despite this shortcoming, the survey allows for an immediate, ground-level assessment of challenges related to the crisis. Because the excluded households are likely to be worse off, the survey's results can be considered conservative estimates of how the entire population is affected, and can still inform policy interventions.

Sampling strategies must be adaptable to local contexts. In Nigeria, an ideal starting point was to call respondents from a previous survey who represented the intended population, since phone numbers had been collected in previous waves of that survey.<sup>11</sup> This approach allowed for the comparison of RERS estimates to estimates from the existing survey, which included the non-phone-using segment of the population. Household characteristics could thus be compared between the sample from the previous survey and the RERS, so that representativeness could be assessed. About 80% of the phone numbers called resulted in successful interviews.

<sup>10</sup>Guidelines on Key Parameters for IPC Famine Classification, Integrated Food Security Phase Classification (2016).

<sup>11</sup>General Household Survey 2016, conducted by National Bureau of Statistics of the Federal Government of Nigeria under Poverty and Conflict Monitoring Systems.

In the absence of existing surveys, a comprehensive list of phone numbers disaggregated by region would provide the best sampling frame; however, such lists are often unavailable, unreliable, or outdated. A bulk SMS to mobile phone users asking for consent to participate in a survey can provide an alternative sampling frame, from which respondents can be randomly selected. To ensure that the crisis-affected population is represented, it is crucial that any bulk SMS can geographically target crisis-affected regions. This approach was followed in Somalia. While this methodology is effective, it further compromises the representativeness of the survey by requiring respondents to reply to a text message before being interviewed. More than 65% of the numbers called resulted in successful interviews, allowing for fast execution of the survey. However, the actual number of recipients of the bulk SMS is unknown, making it difficult to calculate the percentage of SMS recipients who were interested in participating.
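Drawing respondents at random from a bulk-SMS opt-in list can be sketched as follows. This is a minimal illustration and not the procedure used in Somalia: the record layout, the region names, the phone numbers, and the `draw_sample` helper are all assumptions made for the example.

```python
import random
from collections import defaultdict

def draw_sample(opt_ins, per_region, seed=0):
    """Randomly select up to `per_region` consenting numbers in each region.

    `opt_ins` is a list of (phone_number, region) pairs built from replies
    to the bulk SMS. Both fields are hypothetical placeholders.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    by_region = defaultdict(list)
    for number, region in opt_ins:
        by_region[region].append(number)
    sample = {}
    for region, numbers in sorted(by_region.items()):
        k = min(per_region, len(numbers))  # small regions are taken in full
        sample[region] = rng.sample(numbers, k)
    return sample

# Invented opt-in list for two hypothetical regions.
opt_ins = [(f"60000{i:03d}", "Region A") for i in range(40)]
opt_ins += [(f"61000{i:03d}", "Region B") for i in range(15)]
sample = draw_sample(opt_ins, per_region=20)
```

Sampling within each region keeps the geographic targeting of the SMS campaign visible in the final sample, although it does nothing to correct for who chose to reply.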

Random digit dialing (RDD) is a tool that randomly generates phone numbers, and can be a practical solution when phone number lists from surveys or bulk SMS dashboards are not available. This approach was followed in South Sudan and Yemen. Machine-learning algorithms can generate random-digit sequences based on a small set of verified existing numbers to create new numbers that are likely to exist. This reduces the loss of time that results from calling non-existent numbers. However, response rates are still unpredictable, especially if the survey targets specific geographic areas. On average, 10% of the numbers called resulted in successful interviews, prolonging the survey's duration.
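The idea behind seeded random digit dialing, and the effect of the success rate on the number of calls needed, can be sketched as below. The generator is a simple prefix-preserving stand-in, not the machine-learning approach mentioned above; the seed numbers, the four-digit random suffix, and the target of 500 interviews are assumptions for illustration. The success rates are the approximate figures reported in this section.

```python
import math
import random

def generate_candidates(verified, n, random_digits=4, seed=0):
    """Create up to `n` new numbers that share a prefix with a verified number.

    Keeps the leading digits of a randomly chosen verified number and
    replaces the last `random_digits` digits with random ones.
    """
    rng = random.Random(seed)
    known = set(verified)
    candidates = set()
    # Bound the loop so a tiny number space cannot cause an endless search.
    for _ in range(100 * n):
        if len(candidates) >= n:
            break
        base = rng.choice(verified)
        suffix = "".join(rng.choice("0123456789") for _ in range(random_digits))
        number = base[:-random_digits] + suffix
        if number not in known:
            candidates.add(number)
    return sorted(candidates)

def numbers_to_call(target_interviews, success_pct):
    """Numbers to dial, on average, to reach a target number of interviews."""
    return math.ceil(target_interviews * 100 / success_pct)

verified = ["0912345678", "0923456789", "0955512345"]  # invented examples
candidates = generate_candidates(verified, n=50)

# Approximate success rates (in percent) reported for each sampling frame.
for frame, pct in [("existing survey", 80), ("bulk SMS", 65), ("RDD", 10)]:
    print(frame, numbers_to_call(500, pct))
```

At a 10% success rate, roughly eight times as many numbers must be dialed as with a list drawn from an existing survey, which is why RDD prolonged data collection.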

## **3 Key Results**

This section describes the collected data and highlights selected trends, starting with similarities between the four countries, and followed by selective country-specific deep-dives. In Nigeria, the survey involved households located in the North-East, North-Central, and South-South zones; of these, the North-East zone includes states classified to be under the emergency phase as per the IPC. For Somalia and South Sudan, only areas declared to be in a state of emergency or worse were surveyed. In Yemen, the survey covered all regions, stratified into emergency and non-emergency areas. Non-emergency regions were sampled because they had pockets of highly food-insecure households.

The proportion of highly food-insecure households was found to be large, but varied widely between the four countries, ranging from 30% in Somalia, to around 50% in Nigeria and Yemen, to over 70% in South Sudan (Fig. 1).<sup>12</sup> Higher food insecurity was recorded in those countries that faced conflict during the crisis. A high incidence of conflict was reported in Nigeria, South Sudan, and Yemen, while in Somalia, the crisis was primarily due to dry agricultural seasons and a lack of resilience.

**Fig. 1** Food security score by country (*Source* Author's calculations based on RERS data)

In addition to food insecurity, the populations surveyed also faced a range of developmental challenges. Livelihoods were affected in all four countries, with large portions of the populations facing a decrease in income (ranging from 31% in Nigeria to 84% in Yemen) and a change in their main source of livelihood (ranging from 13% in Nigeria to 31% in Somalia). Poor health, insufficient access to water, and low preparedness for drought were also common in all four countries. Other issues such as school attendance and livestock loss were more context-specific (Table 1).

<sup>12</sup>Food security scores are based on the Reduced Coping Strategies Index Score and adapted to define lower scores for less food-secure households. The Reduced Coping Strategies Index Score is calculated using the CSI Field Methods Manual, Cooperative for Assistance and Relief Everywhere (2008).
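A score of the kind described in the footnote can be sketched as follows. The weights are the commonly cited universal weights of the reduced Coping Strategies Index and should be checked against the manual before use. The inversion (maximum index minus the index, so that lower scores mean less food security) is an assumption, because the chapter does not spell out its exact adaptation or the thresholds for the low, medium, and high categories.

```python
# Commonly cited rCSI severity weights; verify against the CSI manual.
RCSI_WEIGHTS = {
    "less_preferred_food": 1,
    "borrow_food": 2,
    "limit_portions": 1,
    "restrict_adult_consumption": 3,
    "reduce_meals": 1,
}
MAX_RCSI = 7 * sum(RCSI_WEIGHTS.values())  # every strategy used on all 7 days

def rcsi(days):
    """Weighted sum of the days (0-7) each coping strategy was used last week."""
    for name, value in days.items():
        if name not in RCSI_WEIGHTS:
            raise KeyError(f"unknown coping strategy: {name}")
        if not 0 <= value <= 7:
            raise ValueError(f"days must be between 0 and 7: {name}={value}")
    return sum(weight * days.get(name, 0) for name, weight in RCSI_WEIGHTS.items())

def food_security_score(days):
    """Assumed inversion: lower values mean a less food-secure household."""
    return MAX_RCSI - rcsi(days)

# Invented household: answers are days in the past week.
household = {
    "less_preferred_food": 3,
    "borrow_food": 1,
    "limit_portions": 2,
    "restrict_adult_consumption": 0,
    "reduce_meals": 2,
}
index = rcsi(household)                 # 3*1 + 1*2 + 2*1 + 0*3 + 2*1 = 9
score = food_security_score(household)  # 56 - 9 = 47
```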




**Table 1**

<sup>a</sup>Due to contextualization, data was not collected for certain topics marked in gray

<sup>b</sup>The 'medium' category had a very low sample size in South Sudan, leading to unreliable results. Thus, it has been combined with the 'high' category

*Source* Authors' calculations

**Fig. 2** Trends in income and food storage (*Source* Author's calculations based on RERS Nigeria data [Conducted with the National Bureau of Statistics of the Federal Government of Nigeria under Poverty and Conflict Monitoring Systems])

In Nigeria, one in five households lost income over the previous 12 months. Highly food-insecure households were more likely to experience a decrease in income than food-secure households (39 and 21% respectively; Fig. 2). Over one in three households did not store food for future use. Highly food-insecure households were most likely to not store food (39%) compared to households with low or medium food insecurity (28 and 29% respectively; Fig. 2). Early warning systems for drought preparedness and food-storage capabilities might allow for higher resilience and reduce the need for desperate coping strategies.

In Somalia, surveyed households had lost livestock and changed their employment activities. Over 30% of the Somali population owned livestock in the previous 12 months. However, among households that owned livestock, four in five faced a decrease in livestock holdings, with the primary reason being death or disease (66%). Livestock had also been depleted by being sold off (9% of livestock-owning households; Fig. 3). Economic assistance to compensate for livestock losses was clearly necessary to increase household income. The survey found that about 15% of households changed their main employment activity, with a shift away from farm-based employment (from 15 to 10% in farm labor; from 20 to 13% in own-account farming), while the number of respondents involved in non-farm businesses increased sharply (from 17 to 28%; Fig. 3). Households responded to the drought by compensating for losses in agricultural income through shifts in the labor market.

**Fig. 3** Trends in livestock losses and employment (Somalia) (*Source* Author's calculations based on Somali RERS)

**Fig. 4** Challenges in accessing food markets and livestock losses (South Sudan) (*Source* Author's calculations based on RERS South Sudan)

In South Sudan, most households rely on markets to buy food. However, while food is generally available in markets, households often cannot afford to buy food because of high prices (Fig. 4). These high food prices are not surprising, given South Sudan's recent period of high inflation. To improve access to food, the survey findings emphasize the importance of vouchers as opposed to food imports.

Almost 40% of the South Sudanese population owned livestock in the previous 12 months. Of these, almost two in three households lost livestock due to death or disease, and theft (25 and 21% respectively; Fig. 4). Livestock restoration could help livestock-based agricultural livelihoods to be regained faster.

**Fig. 5** Challenges in accessing food markets and reasons for low school attendance (Yemen) (*Source* Author's calculations based on RERS Yemen)

In Yemen, food prices and school attendance in particular were found to be affected by the crisis. Most households used markets to access food; as such, an increase in food prices was the key challenge in accessing the market for both food-secure and food-insecure households (Fig. 5). Again, this suggests that future interventions should be based on food vouchers rather than food imports. About one in four households had not sent all their children to school regularly in the previous year, largely due to unsafe routes to school and school closures. Safety issues and school closures resulted in low school attendance for both boys and girls (Fig. 5). This underlines the detrimental effects of insecurity on future generations, and the need to restore educational infrastructure.

## **4 Implementation Challenges, Lessons Learned, and Next Steps**

The results of the Emergency Response Survey showcased above emphasize the importance of understanding the local situation, which is context- and country-specific. A quick turnaround from survey inception to results and analysis is a key factor in the usefulness of the data to inform disaster responses. The survey was completed the quickest in Somalia (2600 interviews in ten days), and slowest in Nigeria (around 600 interviews in 25 days; Fig. 6), with the response rates and the size of the enumeration team being key factors in the speed of data collection. In Somalia, the enumeration team consisted of 25 enumerators, which was five times the size of the team in Nigeria. In South Sudan and Yemen, response rates were less than 15%, which increased the survey's duration. However, despite these various constraints, data collection was quick enough to generate results for each country in eight weeks. The case of Somalia further demonstrates that very rapid data collection can be done with a reasonably sized team of enumerators, even in a constrained environment.

**Fig. 6** Survey duration and implementation costs (*Source* Author's calculations based on RERS)
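The figures above imply quite different workloads per enumerator. The short calculation below is a back-of-the-envelope sketch using only the numbers reported in the text; the Nigerian team size of five is inferred from the statement that the Somali team of 25 was five times larger, and the Nigerian interview count is approximate.

```python
def interviews_per_enumerator_day(interviews, days, enumerators):
    """Average completed interviews per enumerator per day of fieldwork."""
    return interviews / (days * enumerators)

# Somalia: 2600 interviews, 10 days, 25 enumerators.
somalia = interviews_per_enumerator_day(2600, 10, 25)

# Nigeria: around 600 interviews, 25 days, team one fifth the Somali size.
nigeria = interviews_per_enumerator_day(600, 25, 25 // 5)

print(f"Somalia: {somalia:.1f} interviews per enumerator per day")
print(f"Nigeria: {nigeria:.1f} interviews per enumerator per day")
```

Under these assumptions, each enumerator completed about ten interviews a day in Somalia and about five in Nigeria, so the difference in duration reflects both team size and daily throughput.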

This survey methodology can be deployed rapidly while keeping costs low. Operating through country-based call centers cost roughly \$50,000 per country, and the cost per interview was less than \$35 in all countries except for Nigeria. The interviews were most cost-effective in Somalia (\$23 per interview) and most expensive in Nigeria (\$86 per interview; Fig. 6). The bulk of the costs were fixed, and thus a larger sample size drove down the cost per interview. This fixed-cost structure allows for increasingly cheap future rounds of the survey once the call center infrastructure has been established.
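The fixed-cost structure can be illustrated with a simple calculation. The implied totals use only the per-interview costs and interview counts reported in the text. The cost model that follows is a hypothetical sketch: the fixed cost of 50,000 and the marginal cost of 3 per interview are assumed values chosen for illustration, not estimates from the survey budgets.

```python
def implied_total_cost(unit_cost, interviews):
    """Total cost implied by a reported cost per interview."""
    return unit_cost * interviews

def cost_per_interview(n, fixed_cost=50_000, marginal_cost=3):
    """Average cost per interview when most of the budget is fixed.

    `fixed_cost` and `marginal_cost` are illustrative assumptions.
    """
    return fixed_cost / n + marginal_cost

somalia_total = implied_total_cost(23, 2600)  # 59,800
nigeria_total = implied_total_cost(86, 600)   # 51,600

for n in (600, 1200, 2600):
    print(n, round(cost_per_interview(n), 2))
```

Both implied totals are close to the reported figure of roughly \$50,000 per country, which is consistent with the observation that sample size, rather than total spending, drove the difference in cost per interview.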

Statistical infrastructure, such as a list of respondents with phone numbers, can accelerate data collection and improve the representativeness of the survey. In Nigeria, phone numbers collected as part of a nationally representative household survey were used to select respondents. Such approaches can save time in preparing the sampling frame for the survey, compared to negotiating with mobile phone providers to provide randomized lists of phone numbers or to send a bulk SMS. It can also minimize the legal implications of mobile phone-based surveys, as some countries do not allow large numbers of unsolicited phone calls. Using data from a nationally representative survey, as was the case in Nigeria, the representativeness of the collected data can be assessed by comparing respondents who participated in the phone survey with respondents who could not be reached, as well as to respondents who either did not provide or did not have a phone number.
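One simple way to carry out such a comparison is to express the gap between phone respondents and the full baseline sample in units of the baseline standard deviation. The sketch below is an illustration under stated assumptions, not the method used in Nigeria: the `standardized_difference` helper and the household-size values are invented for the example.

```python
from statistics import mean, pstdev

def standardized_difference(full_sample, phone_sample):
    """Difference in means divided by the full sample's standard deviation.

    Values near zero suggest that phone respondents resemble the full
    baseline sample on this characteristic.
    """
    spread = pstdev(full_sample)
    if spread == 0:
        # No variation in the baseline, so a subset cannot differ from it.
        return 0.0
    return (mean(phone_sample) - mean(full_sample)) / spread

# Invented household sizes: baseline sample and the subset reached by phone.
baseline_household_size = [2, 4, 6, 8]
reached_household_size = [4, 6, 8]

gap = standardized_difference(baseline_household_size, reached_household_size)
print(round(gap, 3))
```

Repeating the calculation for several characteristics, such as household size, location, or consumption, shows on which dimensions the phone sample departs from the representative baseline.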

Questionnaire design is a crucial step in preparing a survey. In the absence of quantitative data about the impact of the drought on a variety of areas, the questionnaire was designed to explore diverse developmental topics including education, livelihoods, health, remittances, prices, and market access. Survey data clearly indicated that certain topics were more seriously affected than others, warranting more detailed exploration that was impossible in the context of the initial survey. For example, in South Sudan, more than 90% of households had suffered an illness in the previous three months (Table 1); however, while the questionnaire was able to collect details about the most recent illness in the household, the module was insufficiently deep to allow specific conclusions regarding health interventions to be drawn. In hindsight, additional information on household member-specific and less recent illnesses would have been valuable. However, the design of the questionnaire traded in-depth exploration against comprehensive thematic coverage. Another finding from South Sudan was that remittances were not severely affected by the crisis; again, in hindsight, the questionnaire could have been optimized by adding questions on remittances. However, it is impossible to make these choices a priori, especially during an unfolding crisis situation.

The use of an adaptive questionnaire is a promising approach to escape this limitation. The premise is to adapt the questionnaire while collecting data. The first round of the questionnaire should cover a broad range of developmental topics, with an emphasis on preliminary questions assessing the extent to which each topic is affected. After around 500 interviews, trends from the data collection will indicate which topics warrant more exploration.<sup>13</sup> A survey conducted through a call center allows for the rapid adaptation of the questionnaire, as well as splitting the sample into individually representative parts at no extra cost. Thus, the questionnaire can be adapted after every 500 interviews, with increasing levels of detail on relevant topics and subtopics (Fig. 7, colored green). Less relevant topics can be dropped to keep the duration of the interview manageable (Fig. 7, colored gray). Even saving five minutes by skipping preliminary questions on irrelevant themes can save crucial time in a 20-minute interview for more in-depth exploration of relevant topics.

**Fig. 7** Proposed adaptive questionnaire design (Color figure online)
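The adaptation step after each batch of interviews can be sketched as a simple decision rule. This is an illustration only: the thresholds for expanding or dropping a topic, the topic names, and the `adapt_topics` helper are assumptions, since the chapter proposes the design without fixing numeric cut-offs.

```python
def adapt_topics(batch, expand_at=0.30, drop_below=0.05):
    """Decide, per topic, whether to expand, keep, or drop its module.

    `batch` is a list of dicts mapping topic -> True/False, where True
    means the screening question found the household affected.
    Both thresholds are illustrative assumptions.
    """
    topics = sorted({topic for row in batch for topic in row})
    plan = {}
    for topic in topics:
        answers = [row[topic] for row in batch if topic in row]
        share = sum(answers) / len(answers)  # share of households affected
        if share >= expand_at:
            plan[topic] = "expand"
        elif share < drop_below:
            plan[topic] = "drop"
        else:
            plan[topic] = "keep"
    return plan

# Invented first batch of 500 interviews with deterministic answers.
batch = [
    {
        "health": i % 10 != 0,       # 90% affected
        "education": i % 5 == 0,     # 20% affected
        "remittances": i % 50 == 0,  # 2% affected
    }
    for i in range(500)
]
plan = adapt_topics(batch)
```

In this invented batch, the health module would be expanded, education kept as is, and remittances dropped, freeing interview time for the next 500 calls.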

Adaptive questionnaire design fits well within the fast, low-cost survey methodology of the RERS. Enumerators can be trained on the full, detailed version of the questionnaire before data collection to allow quick adaptation of relevant and irrelevant topics (Fig. 8). The design will create systematically missing values for detailed questions in interviews conducted at the beginning of data collection, and for explorative questions later in the implementation. The random sequence of interviews, however, ensures that the missing data is not biased and, thus, can be analyzed by ignoring missing values. While this will affect standard errors, a large sample, such as the 2500 interviews in Somalia, can ensure sufficiently narrow confidence intervals, even after several adaptations of the questionnaire.

**Fig. 8** Proposed timeline for adaptive questionnaire design with RERS methodology

<sup>13</sup>Around 350 observations is the minimum sample size that provides a 95% confidence interval for estimates. Thus 500 is a sufficient sample size to map early data trends.
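The effect of sample size on precision can be made concrete with the textbook margin of error for a proportion. The sketch assumes simple random sampling and the worst case of a 50% proportion, and it ignores design effects and non-response, so it is a rough guide rather than the calculation behind the figures in the chapter. The example of a question added after the first 1000 of 2500 interviews is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (350, 500, 1500, 2500):
    print(n, f"+/- {100 * margin_of_error(n):.1f} percentage points")

# A detailed question added after the first 1000 of 2500 interviews is
# answered by the remaining 1500 respondents only.
late_question = margin_of_error(2500 - 1000)
```

Under these assumptions, about 350 observations give a margin of roughly five percentage points, and a question asked of only 1500 of 2500 respondents still has a margin of about two and a half points.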

By presenting the example of a pilot survey, this chapter provides proof-of-concept that quantitative data collection via phone surveys is feasible, cost-effective, and informative in the context of shock responses. The pilot highlighted the importance of using an effective questionnaire design, like adaptive questionnaires, to balance the need to comprehensively cover a wide range of topics with the need to collect detailed information on specific sub-topics identified as part of the survey. Although smart and innovative designs can optimize trade-offs, emergency response surveys are no substitute for face-to-face household surveys based on representative sampling frames, ensuring that everybody is included, even the poorest and most vulnerable, who might not own mobile phones. However, compromises need to be made to provide a timely response following a shock, creating a niche for emergency response surveys as presented in this chapter.

Such emergency phone surveys can be prepared and implemented at global and national levels. At the global level, an adaptive questionnaire template can be prepared before emergency situations occur. This will reduce the preparation time needed to adapt the questionnaire template to a country and a specific crisis. At the country level, the groundwork for a survey can be prepared by collecting phone numbers of potential respondents. Questions about phone numbers and the willingness to participate in future survey interviews should be included by default in nationally representative surveys at both the household and firm levels. Lists of phone numbers of respondents who are knowledgeable about specific topics can further add value and allow for more in-depth interviews. These can be obtained by reaching out to sector ministries to collect phone numbers for their staff, which may already be integrated into the HR system. The ability to call, for example, health workers or police officers across an entire country allows for many more monitoring options that would not only be relevant in emergency situations. National statistics offices often maintain such (sufficiently anonymized) phone number databases, and provide them in emergency situations. In crisis-prone countries, the establishment of a call center and, potentially, the use of continuous phone surveys can also further accelerate implementation and provide baseline data.

Emergency phone surveys can play a critical role in crisis analytics, especially if integrated with other data sources that are either recurrently collected in a country, available upon demand, or typically collected during a crisis. For example, market price data is collected in most countries, whether by national statistics offices, UN agencies, or both, and can be triangulated geographically with household interviews. Satellite images can also provide additional context specifically to household interviews, for example, by gathering information about agricultural activity or damage to physical infrastructure. Furthermore, social network data or mobile phone usage data can provide invaluable insight. However, access to such datasets often takes time, making pre-emergency agreements necessary. UN agencies have developed standard data collection methodologies for crisis situations, including the WFP's Vulnerability Analysis and Mapping (VAM), and IOM's Displacement Tracking Matrix (DTM). For a meaningful integration, the different underlying data sources should be readily available and anonymized at the micro level, including sufficiently disaggregated geographical indicators. This requires pre-emergency agreements between involved agencies, ideally at both global and national levels. This is a valuable effort for the avoidance of redundancies and the creation of new synergies. Thus, technological advancements make crisis analytics an extremely powerful tool to inform crisis responses. To harvest their full potential, more pilots must be carried out, and specific statistical infrastructure as well as collaborations and agreements between different stakeholders will be needed.


# **4**

# **Tracking Displaced People in Mali**

**Alvin Etang and Johannes Hoogeveen**

## **1 The Data Demand and Challenge**

For decades prior to the 2012 rebellion, political leaders in northern Mali asserted that their people were marginalized and consequently impoverished. Separatist groups staged unsuccessful rebellions in 1990 and in 2007. In 2012, however, many of those fighting in the rebellion had received training from Gaddafi's Islamic Legion and were experienced with a variety of warfare techniques, and the rebellion that started with attacks on the Malian Army in Menaka in mid-January 2012 culminated in a coup d'état by March 2012 and an attempt to take over the country by force. The three northern regions of Mali, Gao, Timbuktu, and Kidal became occupied by various rebel and Islamist factions until early 2013, when a coalition composed of the Malian Army, French troops, and the ECOWAS-led International Support Mission to Mali (AFISMA) recaptured the occupied areas. Fighting between the Malian Army and the rebel factions broke out again in May 2014, and even though a peace accord was signed in June 2015, northern Mali remains insecure and contested.

A. Etang · J. Hoogeveen (\*)

World Bank, Washington, DC, USA

e-mail: aetangndip@worldbank.org

J. Hoogeveen

e-mail: jhoogeveen@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_4

**Fig. 1** Population pyramids before and after the 2012 crisis (*Source* Mali census data for 2009, INSTAT 2012; authors' calculations using January 2016 Permanent Monitoring baseline survey)

At the height of the security crisis in Mali, over 500,000 people were displaced, nearly half of the estimated 1.2 million people who were living in the north (based on the 2009 population census). By October 2014, the number of displaced people had fallen by more than half: the number of Internally Displaced Persons (IDPs) was estimated at 86,026, and the total number of Malian refugees was 143,471, with around 55,414 living in Mauritania, 53,491 in Niger, 32,771 in Burkina Faso, and 1,330 in Algeria.<sup>1,2</sup>

The impact of the crisis on the population of northern Mali can be illustrated by looking at the age structure of the population in the north. Prior to the crisis, the population pyramid for the three northern regions was comparable to that of the entire country, but by 2015, the population pyramid for the north had changed considerably, reflecting the vast population movements that occurred during the crisis (Fig. 1). The biggest change occurred among children aged ten or younger.

<sup>1</sup>UNOCHA (November 2014): Mali: Evolution de Movements de Population.

<sup>2</sup>See UNHCR: http://data.unhcr.org/SahelSituation/country.php?id=501.

Information on the wellbeing of refugees and IDPs is typically hard to come by (Verwimp and Maystadt 2014), but is needed to formulate a response to the crisis. Information on returnees is particularly difficult to access. The reason for this is obvious: while it is relatively straightforward to interview people while they are displaced, tracking them after their return is much harder.

## **2 The Innovation**

The Listening to Displaced People Survey (LDPS)<sup>3</sup> set out to address the information vacuum around the living conditions of displaced people and returnees. It did so in two ways. First, a baseline face-to-face survey was implemented that exclusively sampled displaced people, refugees, and returnees. Identifying the three target populations was made possible by the fact that each of these groups could be found in an identifiable location. Many displaced people were hosted by families in Bamako and had been registered by UN agencies; refugees were living in camps across the border; and returnees had returned to their locations of origin, predominantly in the northern cities of Gao, Kidal, and Timbuktu.<sup>4</sup> This approach to identifying returnees was possible because by August 2014, when the baseline survey was implemented, many displaced people had already started to return (see Fig. 2). The majority had returned between June and October 2013, a period that followed the signing of a peace deal between the interim government and the rebel factions to allow presidential elections to be held in July and August 2013.

During the baseline survey, information was collected on a range of household characteristics, including household composition, assets and income sources, as well as food security and experiences during the crisis. The baseline survey also asked perception questions about trust and security, about changes in wellbeing, and about perspectives on the future.

<sup>3</sup>Questionnaires, data and metadata of the LDPS are publicly available and can be downloaded from: http://bit.ly/2nsxSd6.

<sup>4</sup>It should be emphasized that locations were not randomly selected. Bamako was selected because it hosted a large number of IDPs, while the main cities in the north of Mali were chosen in order to obtain a large sample of returnees, given the available funds. A refugee camp in Niger was also chosen, as bureaucratic issues did not allow for the inclusion of a camp in Burkina Faso.

**Fig. 2** Timing of return (percentage) (*Source* Authors' calculations based on the Mali Listening to Displaced People Survey)

To track living conditions over time, the baseline survey was complemented with follow-up mobile phone interviews. This approach had the added advantage that if households chose to return during the research period, they remained within the sample. The ability to trace displaced people while they were still on the move was the most important innovation of the LDPS.

The baseline survey was used to identify respondents for the mobile phone interview. Because the survey intended to ask questions about perceptions and was seeking to be representative of the adult population, it was important that one adult was identified from within each household to be the main respondent throughout the survey period. It was equally important for the sake of representation that the person was not always the head of household. As a result, within each household, one person was selected randomly from all household members above the age of 18. Respondents were equally split between men and women to obtain a good representation of the opinions of both genders.
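The respondent selection described above can be sketched in a few lines of Python. The chapter does not say how the equal split between men and women was achieved, so the alternating gender target below is an assumption, as are the function names and the record layout; this is an illustration, not the survey team's actual procedure.

```python
import random

def select_respondent(members, target_gender, rng):
    """Pick one adult at random, preferring the target gender (assumed rule)."""
    # Adults are taken as aged 18 or above, matching the "18+" used in the figures.
    adults = [m for m in members if m["age"] >= 18]
    if not adults:
        return None
    preferred = [m for m in adults if m["gender"] == target_gender]
    # Fall back to any adult when no adult of the target gender lives in the household.
    return rng.choice(preferred or adults)

def select_sample(households, seed=0):
    """Alternate the target gender across households (hypothetical balancing rule)."""
    rng = random.Random(seed)
    selected = {}
    for i, (hh_id, members) in enumerate(sorted(households.items())):
        target = "F" if i % 2 == 0 else "M"
        selected[hh_id] = select_respondent(members, target, rng)
    return selected
```

Because the draw is made among all adults rather than defaulting to the head of household, the resulting panel speaks for the adult population rather than for household heads only.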

Upon completion of the baseline interview, all respondents received a mobile phone to avoid bias with regard to phone ownership. Mobile interviews were conducted at monthly intervals, using a specialized call center in Bamako. Interviews were conducted in the relevant language: French, Bambara, Kel-Tamashek, or Songhai. During the phone interviews (lasting 20–30 minutes), structured questions were asked about the welfare of the household and changes in location, as well as perception questions. Upon completion of the interview, respondents received a small token of appreciation in the form of US\$2 worth of phone credit.

Over a period of twelve months, from August 2014 to August 2015, monthly interviews were conducted. The original sample comprised 501 respondents (51% men, 49% women) split between IDPs located in the capital city of Bamako (100), refugees living in refugee camps in Mauritania (100) and Niger (81), and returnees living in northern Mali, in the regional capitals of Gao (90), Timbuktu (80), and Kidal (50).

## **3 Key Results**

The households in the sample comprise only displaced or formerly displaced people, so to investigate how they compare to non-displaced households, the survey data need to be set against existing data. Figure 3 illustrates the comparison for level of education, against baseline data collected in 2011, prior to the crisis. It compares levels of education of adults in the four cities of Bamako, Gao, Timbuktu, and Kidal. It is important to note that levels of education in Mali are extremely low. Even in the capital city of Bamako, more than half of the adults have not progressed beyond primary education, while in Kidal and Timbuktu, 80% completed primary education at most. In comparison, IDPs and returnees are better educated, aside from those in Gao. IDPs in Bamako have levels of education comparable to the general adult population of Bamako, which is higher than that of the urban population in the north. Returnees are also more likely than the overall populations of Kidal and Timbuktu to have achieved secondary education or higher.<sup>5</sup>

**Fig. 3** Level of education of population aged 18+ (percentages) (*Source* Authors' calculations using the Listening to Displaced People Survey and the Enquête Modulaire et Permanente, EMOP 2011, of the Mali National Institute for Statistics, INSTAT)

Refugees, in contrast, are less educated. In particular, refugees who went to Niger have lower levels of education than the overall population of northern Mali.

Regarding consumer durables, all three sub-populations (IDPs, refugees, and returnees) were found to have higher levels of ownership than the average citizen of the north. Despite the loss of consumer durables due to the crisis, IDPs, refugees, and returnees still own as many assets as, or more than, the average population of the north did prior to the crisis. This is shown in Fig. 4, which presents the proportion of IDPs, refugees, and returnees who own assets after the crisis and compares this with the percentage of households who owned assets in 2011, prior to the crisis, in Gao, Timbuktu, and Kidal. The value of assets owned by IDPs and refugees was found to be comparable to that of households between the third and fourth wealth quintiles, locating displaced people in the middle or upper-middle classes. As with education, displaced people's levels of asset ownership are more comparable to those of the average citizen in Bamako than to those of the average citizen of the urban areas of Gao, Timbuktu, and Kidal.

This finding that displaced people were better off than others is confirmed by Peña-Vasquez and Mueller (2017), who use the same database. They conclude that people were more likely to opt for displacement when they felt more at risk, when they were relatively better off, and, interestingly, when they lived in villages with greater access to transportation, either by land or water.

<sup>5</sup>Some of the results presented in this section have also been reported in Etang Ndip et al. (2016).

**Fig. 4** Asset ownership compared with regional average (*Source* Authors' calculations using the Listening to Displaced People Survey, 2014 and the Enquête Modulaire et Permanente, EMOP, 2011 of the Mali Institute of Statistics (INSTAT))

The main interest in tracking displaced people, for the purposes of this chapter, lies in what the survey can tell us about their living conditions over time. The results show how the respondents' perception of their living conditions changed over time and across locations. In wave 12 in Kidal, for instance, there is a large decrease in the proportion of respondents stating that their living conditions were worsening, and an increase in respondents stating that they remained the same. This wave followed the signing of the Peace Accord in June 2015; however, the optimism found in Kidal at this time was not shared by the other three cities covered by the survey (Fig. 5).

The data collected take the form of a longitudinal (panel) dataset, which makes it possible to control for individual fixed effects. Hoogeveen et al. (2019) exploit the panel nature of the dataset to investigate the drivers of the decision to return, exploring how employment status, security, and expectations affect people's willingness to go back home. The findings suggest that the decision to return is affected by a comparison of (opportunity) costs and benefits, but also by other factors: individuals who are employed while displaced are less willing to return home, as are better-educated individuals, or those receiving assistance. The opposite is true for ethnic Songhais and people from Kidal. The results show that individuals with higher levels of education do better when displaced, and if they return, they find jobs more easily than those with less education.

**Fig. 5** Changes in perceived living conditions over the duration of the survey (*Source* Authors' calculations based on Mali Listening to Displaced People Survey)

Using all twelve waves of the survey, Hoogeveen et al. (2019) ran a fixed effects linear probability model. The individual fixed effects capture all time-invariant individual characteristics such as ability, education, and stamina, as well as several stable household characteristics and environmental factors (e.g. attitude toward refugees or IDPs in the local community), while the time fixed effects control for events specific to a time period, such as weather shocks or military events. They find that those who found employment while being displaced were significantly less likely to return, while refugees and those who owned a gun were more likely to return (Fig. 6).
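To make the mechanics of such a model concrete, the sketch below estimates a two-way fixed effects linear probability model of the form returned_it = β · employed_it + α_i + γ_t + ε_it, using the within transformation on a balanced panel. It is a minimal illustration under stated assumptions: a single regressor, no missing waves, and no standard errors. The variable and function names are hypothetical, and it is not the specification used by Hoogeveen et al. (2019).

```python
def demean_two_way(panel):
    """Two-way within transformation of a balanced panel given as panel[i][t]."""
    n, t = len(panel), len(panel[0])
    row_means = [sum(row) / t for row in panel]                              # individual means
    col_means = [sum(panel[i][j] for i in range(n)) / n for j in range(t)]   # wave means
    grand = sum(row_means) / n
    return [
        [panel[i][j] - row_means[i] - col_means[j] + grand for j in range(t)]
        for i in range(n)
    ]

def fe_lpm_slope(returned, employed):
    """OLS slope on demeaned data: the effect of `employed` on the probability of return."""
    y = demean_two_way(returned)
    x = demean_two_way(employed)
    sxy = sum(xv * yv for xr, yr in zip(x, y) for xv, yv in zip(xr, yr))
    sxx = sum(xv * xv for xr in x for xv in xr)
    return sxy / sxx

# Toy panel: 3 individuals observed over 3 waves (1 = returned / employed).
returned = [[0, 0, 1], [0, 1, 1], [0, 0, 0]]
employed = [[1, 1, 0], [0, 0, 0], [1, 1, 1]]
print(fe_lpm_slope(returned, employed))
```

Subtracting individual and wave means removes α_i and γ_t before the regression is run, which is why characteristics that do not vary over time, such as education, cannot be estimated separately in this setup.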

**Fig. 6** Fixed effects regression on the decision to return (*Source* Hoogeveen et al. 2019)

## **4 Implementation Challenges, Lessons Learned, and Next Steps**

The success of the tracking survey depended on the ability to maintain a stable sample. The measures employed were not unlike those discussed in Chapter 2: respondents received phones, were rewarded for participation with phone credit, and were given the opportunity to carry out the interview in their own language. The survey team emphasized approaches that might reduce drop-out; for example, respondents were asked to indicate the time at which they preferred to be called. During the call, they would always speak to the same enumerator, thus building rapport. In the refugee camp in Mauritania, response rates declined due to weak network coverage. This was resolved by working with field-based enumerators who relayed the responses back to the call center in Bamako. The team also asked community members to follow up on respondents who could not be reached over the phone. This tracking mechanism was set up at the survey design stage by collecting alternative phone numbers for the respondents, such as those of other household members, friends, and neighbors. This helped enumerators reach respondents who did not answer their own phones. These measures were effective: the non-response rate was very low, between 1 and 2% per wave. The percentage of households not responding to more than two consecutive rounds was even lower, at only 0.8%. Attrition rates bore little relation to the movement of the respondent. For instance, in the area with the highest amount of movement, Bamako, the initial sample comprised 100 households. Of these, 12% indicated one year later that they had moved, but only one household dropped out of the sample. A similar finding holds true for Gao, where the sample initially comprised 90 households, and although some 7% moved, only two households dropped out of the sample.
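The two stability indicators quoted above (non-response per wave, and households missing more than two consecutive rounds) are straightforward to compute from a response log. The following is a minimal sketch; the data layout, with one list of True/False flags per household, and the function names are assumptions made for illustration.

```python
def wave_nonresponse_rates(responses):
    """responses: {household_id: [responded in wave 1, wave 2, ...]} -> rate per wave."""
    n = len(responses)
    waves = len(next(iter(responses.values())))
    return [sum(not hist[w] for hist in responses.values()) / n for w in range(waves)]

def longest_gap(history):
    """Length of the longest run of consecutive missed waves."""
    longest = current = 0
    for responded in history:
        current = 0 if responded else current + 1
        longest = max(longest, current)
    return longest

def share_with_long_gaps(responses, max_gap=2):
    """Share of households that missed more than `max_gap` consecutive waves."""
    flagged = sum(longest_gap(hist) > max_gap for hist in responses.values())
    return flagged / len(responses)
```

Tracking the longest gap separately from the per-wave rate distinguishes households that miss an occasional call from those that are effectively lost to the panel.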

Not only is the stability of the sample quite remarkable, but this survey also demonstrates that mobile phone surveys are useful tools for collecting data in hard-to-reach places. The case of Kidal, a desert town, illustrates this point. Kidal lies in a remote corner of northern Mali and is only accessible by 'piste' (i.e. unmarked dirt road), and the nearest town, Gao, is 285 km away. Moreover, during the period in which the data were collected, the government of Mali exercised no control over the town. Despite these factors, which would normally greatly hinder data collection, the mobile phone survey collected information on a monthly basis with response rates that were near-universal (see Fig. 7, right panel).

**Fig. 7** Attrition rates (*Source* Authors' calculations using the Mali Listening to Displaced People Survey)

The ability to follow respondents as they change locations offers exciting new possibilities for welfare monitoring, as movement is often associated with large societal changes in welfare. We know, for instance, that rural-to-urban migration is associated with declining poverty of the movers in a process called structural transformation, in which increases in agricultural production facilitate rural–urban migration by increasing rural incomes while simultaneously suppressing (urban) food prices. Once this process starts, markets become more important, the nonfarm and agribusiness sectors grow, and the food value chain and rural–urban linkages are strengthened. As rural incomes grow even further, second-order effects emerge: the stock of human and physical capital increases as households invest part of their increased incomes in their offspring. This leads to further productivity gains, and to emigration of better-educated people. While this process is well understood, surprisingly little is known about how individual migrants fare during the process of transition. Nor is much understood about the characteristics of successful migration, as opposed to migration in which one ends up chronically poor in an urban slum. Mobile phone tracking surveys can be used to collect the data needed to fill this knowledge gap, and can be applied equally to returning IDPs and refugees, to school leavers, to those completing a job training program, or to those having gone through a disarmament, demobilization, and reintegration (DDR) program.

## **References**


Peña-Vasquez, A., and D. Mueller. (2017). "Consequences of Conflict: Forced Displacement, Insecurity, and Transportation in Northern Mali." Mimeo.

Verwimp, P., and J.-F. Maystadt. (2014). "Forced Displacement and Refugees in Sub-Saharan Africa. An Economic Enquiry." Policy Research Working Paper no. WPS 7517, World Bank Group, Washington, DC.


# **Resident Enumerators for Continuous Monitoring**

**Andre-Marie Taptué and Johannes Hoogeveen**

## **1 The Data Collection Challenge**

The conflict in Mali in 2012 broke out after a long period of political and economic stability. It began when armed separatist groups occupied the northern desert and semi-desert regions. A period of instability followed, during which an estimated 36% of the total population from the affected regions fled to the south of Mali and to neighboring countries. The crisis had dramatic effects on public infrastructure and services and reduced people's mobility and their access to markets. It also led to the destruction and theft of assets and shook investor confidence. Farmers were cut off from their fields, artisans were unable to sell their produce as tourism came to a halt, traders were unable to move, and breeders with high numbers of livestock were forced to leave conflict-affected areas for safer places, losing many of their animals to theft along the way. The crisis reinforced the feeling of neglect by the Malian state among those from the affected areas, while simultaneously strengthening cross-border ethnic loyalties and economic ties. The conflict officially ended with the Peace Accords signed in May and June 2015, but the North remains insecure, as it has become a safe haven for terrorists and criminals.

A.-M. Taptué (\*) · J. Hoogeveen

World Bank, Washington, DC, USA

e-mail: ataptue@worldbank.org

J. Hoogeveen

e-mail: jhoogeveen@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_5

The crisis created distrust between different ethnic groups and among people of different religious affinities. Social cohesion weakened, and interactions became more restricted, inducing a feeling of fear. About one in three people living in Timbuktu or Gao reported in July 2016 that they did not feel safe at home at night; in Kidal, this rose to two in three. Many people distanced themselves from social networks, neighbors became estranged, mixed marriages ended, and even within families, members became wary of each other. Animosity was also expressed toward the government. By July 2016, 53% of the population had lost confidence in the government, and confidence in the judicial system ranged from 66% in Timbuktu to as low as 8% in Kidal.

Collecting data in these circumstances is challenging, especially for emissaries of the central government. In fact, since the outbreak of the conflict in 2012, agents of the National Institute of Statistics have been unable to collect any data in Kidal or elsewhere in northern Mali. Data, however, were urgently needed to monitor developments after the signing of the Peace Agreement. The Peace Agreement had established conditions for the restoration of stability and economic recovery, and called for development planning and new investments in the north, as well as the creation of a monitoring system to assess the impact of assistance on security, socio-economic development, and wellbeing.

The Permanent Monitoring System (PMS) was created to respond to this data challenge. It consists of an observatory that relies on local enumerators living in northern Mali, who collect data on a monthly basis. The PMS amasses information from a representative sample of households living in the targeted areas,<sup>1</sup> and from local authorities, clinics, schools, and markets, where commodity prices are collected.<sup>2</sup>

## **2 The Innovation**

When enumerators from outside a community are not welcome or when travel to and within a region is dangerous for outsiders, one solution is to work with locally recruited enumerators who reside in the area. The use of such 'resident' enumerators is usually discouraged, particularly for consumption surveys: experience shows that, because possibilities for supervision are limited, data quality tends to erode over time, while respondents tend to grow tired of answering detailed questions repeatedly about the various items they have consumed. For this reason, many consumption surveys have shifted from collecting consumption data using diary methods to approaches that rely on recall. In the former approach it is often necessary for enumerators to stay in the village for up to a month; using recall methods, survey teams can stay in the village for much shorter periods of time.

<sup>1</sup>As many households had fled the area, old sampling frames were no longer valid. To ensure that a representative sample was nonetheless collected, each enumerator had a list of local landmarks as well as a direction in which to move. Enumerators would start at the landmark and sum the digits of the date of the day until a single digit was obtained. That was the number of the first household to be interviewed, counting from the landmark. Subsequently every second (rural areas) or every fifth household was interviewed until a total of 5 households had been interviewed, after which they would move to the next landmark. The selection of individuals within the household to answer the questionnaire was conducted as follows: the head of household (male or female) was selected to answer the first part of the questionnaire, dealing with general questions about the household. Using the roster of household members compiled during the first part of the interview, another member of the household aged 18 or above was selected randomly to answer the second part of the questionnaire, in which perception questions were asked. Alternation between male and female was ensured. The survey thus generated data that are reflective of the opinions of those aged 18 and above in northern Mali.

<sup>2</sup>All data are made publicly available (http://www.gisse.org/pages/miec/suivi-permanent1.html), and reports have been widely disseminated.
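The household-selection walk described in footnote 1 can be written down as a short routine. This is a sketch under assumptions: "the date of the day" is read as the day of the month, the interval is counted from the first selected household, and the every-fifth rule is taken to apply to all non-rural areas. The function names are hypothetical.

```python
def digit_sum_to_single(day):
    """Repeatedly sum the digits of the day until a single digit remains."""
    while day >= 10:
        day = sum(int(d) for d in str(day))
    return day

def households_to_visit(day_of_month, rural, per_landmark=5):
    """Positions (counted from the landmark) of the households to interview."""
    start = digit_sum_to_single(day_of_month)
    step = 2 if rural else 5  # every second household in rural areas, every fifth otherwise
    return [start + k * step for k in range(per_landmark)]

# On the 27th in a rural area: 2 + 7 = 9, so the walk starts at the ninth household.
print(households_to_visit(27, rural=True))
```

Tying the starting point to the calendar date gives enumerators a rule they can apply in the field without a sampling frame, while leaving them no discretion over which households to approach.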

There are major advantages to an approach that relies on enumerators who reside for longer periods in a village. One is that resident, locally recruited enumerators know the survey areas well. This reduces many of the complexities associated with insecurity, local grievances, or language. The latter is a critical advantage. Ethnolinguistic fractionalization is high in Africa, and in many locations the ability to speak the respondent's language of choice is key to the success of a survey. When enumerators cannot phrase questions in the language that respondents are most comfortable with, responses may be wrong or biased.

Another advantage of using resident enumerators is that, contrary to survey teams that visit an enumeration area for a short period of time, resident enumerators have ample time to carry out interviews. Capitalizing on this, enumerators of the PMS were asked to administer five different survey instruments. The first was a household survey containing multiple modules, among them socio-demographic characteristics and income-generating activities, with detailed questions on agriculture, livestock, fisheries, and entrepreneurial activities. The survey collected information on assistance received, the return of refugees and internally displaced people, the health of household members, and food security. The household survey also covered shocks that households might have experienced, possession of assets, and access to services. A part of the questionnaire was devoted to subjective questions about the implementation of the Peace Agreement, perceptions of security, and priorities for initiatives that could consolidate peace and security in the region.

A second survey instrument was used to interview local authorities (mayors, traditional authorities, and local chiefs) to collect information about local initiatives and interventions, as well as information about the evolving security situation. A third survey was administered to assess the operations of health care centers in the surveyed villages. This survey assessed the impact of the crisis on their functioning, the presence and return of staff who had fled during the conflict, the assistance they received, and their needs in terms of supplies and equipment. A fourth survey was conducted in primary schools in the surveyed villages. Like the clinic survey, it assessed the presence of staff and the return of teachers who had fled during the crisis, the assistance the schools had received, and the schools' needs. The fifth and final survey instrument collected information on prices of a selected list of commodities to gauge the changes in the cost of living in different localities.

Another advantage of resident enumerators is that they are in a better position to deal with 'moving' populations such as herders, of which there are many in northern Mali. Herders move about, few have mobile phones, and even if they do possess them, they are often out of range of a telephone network. This means that phone interviews as discussed in Chapters 2 and 3 are not feasible, particularly if non-random non-response is to be avoided. The ability to deal with moving respondents is determined by the time availability and local knowledge of resident enumerators. Locally recruited enumerators know where to find pastoralists as they regularly gather at specific locations to water their animals, or as they move from pasture to pasture following well-established grazing patterns. Herders are not the only mobile population. In many places in Africa, farmers also move. During the growing season, many remain at their fields in temporary shelters only to return to their village after the harvest. Enumerators selected from the village can follow households to their farms for interviews. They know the area and, unlike survey teams visiting enumeration areas for only a short period, have more flexibility in when to carry out an interview. They can meet respondents in the evening, early in the morning, at the market, or at the place where the respondent carries out his or her business.

Once enumerators have developed good relationships with the respondents, and respondents have confidence in them, resident enumerators are more likely to elicit accurate information, particularly when questions are sensitive and the enumerator is able to assure respondents that responses will be kept confidential. This is another advantage of resident enumerators: their presence dispels fear and creates trust, trust that can be difficult to establish between people from different localities or ethnic groups. The importance of this should not be underestimated in a (post)conflict situation, as it is not uncommon for respondents in insecure locations to fear reprisals for having provided information to an outsider, no matter how innocuous the information may seem. However, if the enumerator stays with the respondents in the village, it signals trustworthiness and allays such fears.

The flexibility and level of trust the locally recruited enumerators built in northern Mali allowed them to collect high-quality data and to assure high response rates over the course of a year. Between January 2016 and January 2017, the highest household non-response rate encountered was 4.4% in October 2016 in Gao, when ten households did not respond to the survey; however, they resumed their participation in January 2017. During this particular month, the northern regions experienced 21 attacks and bomb explosions, including six in Gao, four in Timbuktu, and two in Kidal. Still, these insecurity events did not disrupt the survey or cause the response rate to drop.

There were two clear challenges when it came to relying on resident enumerators. The first is that it may be difficult to identify skilled enumerators in the communities of interest. Particularly in remote locations, the number of suitable candidates may be limited. Few people are likely to have experience with survey data collection, and finding people with a certain level of formal education may be difficult. In the case of the northern Mali survey, the pool of eligible candidates was further reduced by the requirement that prospective enumerators have their own means of transport, which allowed them to move about easily while also assuring a greater sense of ownership and responsibility than one might expect with project-provided transport. A second challenge is supervision.

Hiring and managing enumerators was delegated to a local firm with extensive experience in data collection, and with a robust network in northern Mali.<sup>3</sup> It advertised the positions on its website and mobilized its contacts in the region to publicize the job opportunity. To assure the independence and objectivity of enumerators, the firm avoided relying on local authorities for recruitment. Those applying were expected to send proof of their education (diplomas) and of the possession of a motorbike (as a means of transport); these were later checked during enumerator training and the first supervision mission. Enumerators were informed that they would not be engaged full-time, and could take up other employment as long as they were available for this activity for at least one year, and during the first two weeks of each month.

<sup>3</sup>Different contexts and data demands call for different solutions to this problem. For example, in the context of a national public works program in the Central African Republic (LONDO project), the team used former locally recruited team leaders to collect information on how beneficiaries used the bicycles they had received after the project had left the area. Their reason for relying on these former employees was that the areas would be difficult for survey teams to access, while security costs would be prohibitive. Moreover, former team leaders had deep local knowledge and were able to find the beneficiaries through local social networks, as many have no phones.

Enumerators who had finished at least secondary education were sought, but it proved challenging to find people with that level of education living in remote villages, as less than 5% of the population of northern Mali has completed secondary education or higher (INSTAT 2012). In the end, and after devoting much effort to identifying and hiring enumerators in each enumeration area, it was not possible to find sufficiently qualified people for each location. As a solution, certain enumerators were required to cover two or three villages close to one another, a solution that caused few problems since enumerators had their own motorbikes to move between villages.

For those who did qualify, wages were high. To complete some 20 questionnaires over a two-week period every month, enumerators were paid approximately US\$350 per month, plus a premium of US\$600 every quarter and again at the end of the operation. These premiums were needed to assure the continued participation of the enumerators, as other organizations active in the area were offering competitive salaries. Although the budget for enumerator fees was relatively high, the overall cost of one round of data collection was reasonable: about US\$30,000 for approximately 800 questionnaires (12 households per enumeration area, plus school, clinic, district leaders, and price questionnaires), or less than US\$40 per questionnaire. This relatively low unit cost, despite high salaries, reflects the minimal expenses incurred for transport, printing and communications, and meals and lodging (Fig. 1).
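The unit-cost figure can be checked with a few lines of arithmetic. The sketch below uses only the approximate figures quoted in this chapter (about US\$30,000 per round, about 800 questionnaires, 12 households per enumeration area, and the 56 enumeration areas reported in Sect. 3); the variable names are illustrative and not part of the survey documentation.

```python
# Approximate figures quoted in the chapter; names are illustrative only.
ROUND_COST_USD = 30_000      # cost of one monthly round of data collection
QUESTIONNAIRES = 800         # approximate questionnaires per round (all types)
ENUMERATION_AREAS = 56       # villages and city areas covered (Sect. 3)
HOUSEHOLDS_PER_AREA = 12     # households interviewed per enumeration area

# Household questionnaires per round; the remainder of the ~800 are
# school, clinic, district-leader, and price questionnaires.
household_questionnaires = ENUMERATION_AREAS * HOUSEHOLDS_PER_AREA
other_questionnaires = QUESTIONNAIRES - household_questionnaires

# Average cost per completed questionnaire.
unit_cost_usd = ROUND_COST_USD / QUESTIONNAIRES

print(household_questionnaires)  # 672
print(other_questionnaires)      # 128
print(unit_cost_usd)             # 37.5
```

The result, about US\$37.50 per questionnaire, is consistent with the "less than US\$40" stated above.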

In the end, 35 enumerators were hired. Since traveling to northern Mali was not recommended for people not from the area, all enumerators were invited to Bamako for one week of training. In addition to becoming familiar with the survey material, questionnaires and manuals, much emphasis was placed on how to behave, as the aim was for the enumerators and respondents to develop an ongoing relationship for a period of more than one year. Hence, the training emphasized the importance of confidentiality, the importance of maintaining good relationships with respondents and local authorities, and the necessity of remaining neutral when collecting data.

**Fig. 1** Map indicating the location of enumerators across northern Mali (*Source* World Bank 2016)

Maintaining data quality was not an issue, as the response rates presented in Table 1 illustrate. Not only were enumerators motivated, as demonstrated by the fact that none dropped out of the exercise, but the use of tablets to collect data and the ability to remotely supervise the enumerators' actions improved data quality dramatically. The tablets registered the data and the time of data collection, along with the GPS coordinates of where the data were entered. These para-data allowed the firm supervising the data collection to assess whether the enumerators had indeed visited households for interviews, and to assess the average response time. The use of Computer-Assisted Personal Interviewing (CAPI) thus solved an important supervision problem that might otherwise have affected the quality of data collected by local enumerators operating under limited supervision.

**Table 1** Response rate (percentage of households that answered the survey)

*Source* Authors' calculation using data from the Permanent Monitoring System
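The para-data checks described above (was the interview recorded near the assigned village, and did it last a plausible amount of time?) can be sketched as follows. This is a minimal illustration, not the supervising firm's actual procedure: the record layout, the 5 km distance tolerance, and the 10-minute minimum duration are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Interview:
    """Hypothetical para-data record exported from a tablet."""
    household_id: str
    lat: float           # GPS latitude where the data were entered
    lon: float           # GPS longitude where the data were entered
    duration_min: float  # interview length in minutes


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometers."""
    radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def flag_interviews(interviews, village_lat, village_lon,
                    max_km=5.0, min_minutes=10.0):
    """Return {household_id: [reasons]} for interviews that look suspicious.

    max_km and min_minutes are illustrative thresholds, not survey rules.
    """
    flags = {}
    for iv in interviews:
        reasons = []
        if haversine_km(iv.lat, iv.lon, village_lat, village_lon) > max_km:
            reasons.append("far_from_village")
        if iv.duration_min < min_minutes:
            reasons.append("too_short")
        if reasons:
            flags[iv.household_id] = reasons
    return flags
```

A supervisor could run such a check after each weekly data transfer and follow up by phone only on the flagged interviews, which keeps remote supervision light.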

To facilitate data collection using tablets, enumerators were trained in the use of CAPI techniques. Different questionnaires (for households, schools, clinics, and district leaders, and price questionnaires) were programmed in CSPro, and a server was installed in the office of the firm supervising the work. Using CAPI allowed enumerators to send data to Bamako as soon as they completed a survey and had access to the internet. Though phone network coverage is limited in northern Mali, the network exists, at least in the urban center of each district. It was agreed that at least once a week, enumerators would move to a location that had network coverage to transfer their data to the server. At the beginning of each month, when enumerators were within reach of the phone network, they were paid using mobile payment systems such as Orange Money. Enumerators also downloaded new or updated questionnaires at these times. Relying on CAPI thus allowed the team to dynamically change the questionnaires used. Core questions typically did not change, but the questionnaires were adapted regularly to respond to new requests for information from development agencies and the government. Questionnaires were also changed in response to events on the ground: enumerators were expected to report noteworthy events, the distribution of material to farmers and breeders, and the functioning of schools and clinics. Their feedback was then used to update the questionnaires.

CAPI was not used everywhere: in some villages in Kidal, paper questionnaires continued to be used as respondents had expressed concerns about the use of tablets. They feared enumerators might use the GPS capacity of the tablet to order drone strikes. In these few instances, enumerators filled in paper questionnaires and subsequently transferred the responses onto the CAPI system, before electronically sending the responses to the server in Bamako.

The firm visited each enumeration area every six months for additional supervision. This exposed the supervision team to insecurity while traveling, but once in the villages, the team was generally given a warm welcome. The team would meet with local authorities, including traditional and religious authorities, to (re)explain the objectives of the activity, and to request continued collaboration. The team also met with citizens at large, stressing how the enumerators were working in the interests of the whole community by striving to collect good information on the issues affecting their villages.

These efforts were successful. Quality data were collected throughout the entire period, and for more than one year the PMS informed the government of Mali and international organizations about changes in the situation in northern Mali. Best of all, none of the enumerators were harmed, nor was any survey respondent affected by violence that could in any way be associated with the survey.

# **3 Key Results**

From September 2015 to January 2016, 35 enumerators covered 672 households across 56 villages and city areas, administering the five different types of survey instruments. Some key results are presented below. Food insecurity was found to mostly affect households in Gao and particularly in the early months of the year, when more than one-quarter of Gao's citizens lived in a state of food insecurity; this declined to around one-fifth between March and October. In Kidal, food insecurity was found to be much less of an issue, with less than 10% of households living in a state of food insecurity throughout 2016. In January 2017, however, food insecurity became a more serious issue in Kidal, and 19% of households were affected. In Timbuktu, few were affected by food insecurity throughout the duration of the survey (Fig. 2).

**Fig. 2** Percentage of households living in a state of food insecurity (*Source* Authors' calculations based on data from the Permanent Monitoring System)

**Fig. 3** Perceptions of security (*Source* Authors' calculations based on Mali Permanent Monitoring System)

Despite the Peace Agreement, the surveyed households' sense of security decreased considerably during 2016. Between January and December 2016, the percentage of the population who were comfortable at home at night decreased from 79 to 47% in Gao, from 91 to 8% in Kidal, and from 74 to 63% in Timbuktu. The pattern was the same for feelings of security during the day: between January and December 2016, the percentage of the population who felt secure going out alone during the day decreased from 81 to 64% in Gao, from 95 to 13% in Kidal, and from 72 to 69% in Timbuktu (Fig. 3).

**Fig. 4** Confidence in the government and the judicial system (*Source* Authors' calculations based on Mali Permanent Monitoring System)

Following the conflict, confidence in government was low, especially in Kidal, where less than 20% of the population was found to have confidence in the Malian government. Confidence levels barely changed throughout 2016. In Gao, the percentage of the population having confidence in the government never exceeded 70%. In Timbuktu, levels of confidence were generally higher, but they fluctuated quite considerably over time. Confidence in the judicial system followed a similar pattern, with particularly low levels in Kidal, higher levels in Gao, and the highest levels in Timbuktu, where confidence in the judicial system nonetheless decreased from 70 to 51% between January and December 2016 (Fig. 4).

The patterns of confidence in the government and the judicial system carry over to trust in people from other ethnic groups and in foreigners. In Gao and Timbuktu, the percentage of the population with trust in people from other ethnic groups was relatively high compared to Kidal, but decreased over time, from 77% in Gao and 75% in Timbuktu in January 2016, to 71% in Gao and 56% in Timbuktu in December 2016. In Kidal, levels of trust were found to be significantly lower, at around 40% and falling during 2016. Trust in foreigners was virtually non-existent in Kidal, at less than 10%, but much higher (and rising) in Timbuktu and Gao. In contrast, the population across all three locations was found to have high levels of confidence in religious and traditional authorities. Nearly the entire population in the three regions indicated that they had confidence in religious leaders, and the same pattern held true for traditional leaders in Gao and Timbuktu, but not in Kidal, where confidence in traditional leaders was found to be much lower, at 63% in January 2016, and declining over time (Fig. 5).

**Fig. 5** Confidence in people (*Source* Authors' calculations based on Mali Permanent Monitoring System)

During 2016, the problems faced by healthcare centers remained largely unresolved. Some problems, such as the lack of medication and lack of staff, even increased between January and December 2016, although staff absenteeism declined considerably over the same period. Other problems, such as the lack of infrastructure, became less pressing, but generally speaking, very limited progress was made in restoring health services. The state of schools was similar. The lack of teachers decreased from 24 to 16% over 2016, as did the lack of school materials, declining from 24 to 20%. However, other issues became more pressing, including the lack of classrooms and the absence of school feeding (Fig. 6).

**Fig. 6** Problems reported by health facilities and schools (*Source* Authors' calculations based on Mali Permanent Monitoring System)

# **4 Lessons Learned and Next Steps**

Implementation of the PMS was surprisingly straightforward, not least because World Bank staff collaborated closely with a high-quality survey firm with experience in northern Mali. Two major challenges, the hiring of enumerators and ensuring data quality, have been discussed already. A third challenge proved to be financing. While the data produced were well-received and in demand, and even though each round of data collection was relatively inexpensive, after 15 months of continuous data collection, the team failed to identify the funding needed to continue the exercise.

Fortunately, an alternative was identified. While financing for generalized data collection proved hard to find, funding for third-party monitoring of project activities was available. The mix of terrorism and armed violence rendered field supervision by donor representatives impossible. At the same time, donors wished to invest more in the north to support the peace process. Because donor representatives were not able to visit project sites in northern Mali, they started to rely on third-party monitors. Often, these are local NGOs that are also involved in reconstruction activities (raising concerns about conflicts of interest), or specialized outside firms with a higher risk appetite, at a commensurate price. Irrespective of their nature, these third-party monitors collect information, for example on the progress of a construction project, which is a task familiar to the resident enumerators used for this project. The resident enumerators were thus retrained to act as third-party monitors. Relying on local enumerators for third-party monitoring is new, and the World Bank is testing this approach against an alternative of visits by experts from local NGOs. This test is ongoing, but if the results of the continuous data collection in northern Mali offer any guidance, it seems likely that the local enumerators, equipped with tablets, cell phones, and motorbikes, will be able to provide quality data at a fraction of the cost that is usually paid for third-party monitoring.

## **Annex: Evolution of Security and Economic Indicators**






*Source* Authors' calculations based on the Mali Permanent Monitoring System


# **References**



# **6**

# **A Local Development Index for the CAR and Mali**

**Mohamed Coulibaly, Johannes Hoogeveen, Roy Katayama and Gervais Chamberlin Yama**

# **1 The Data Demand and Challenge**

The Central African Republic (CAR) has been affected by repeated cycles of violence and conflict. A landlocked country in Central Africa, with an area of about 620,000 square kilometers and an estimated population of around 4.9 million, the CAR is sparsely populated. Despite a wealth of natural resources such as uranium, crude oil, gold, diamonds, cobalt, lumber, wildlife, and hydropower, as well as significant quantities of arable land, the CAR is among the ten poorest countries in the world. According to the Human Development Index, the country had the lowest level of human development in 2016, ranking last out of 188 countries.

M. Coulibaly · J. Hoogeveen · R. Katayama (\*) · G. C. Yama World Bank, Washington, DC, USA

M. Coulibaly e-mail: mcoulibaly2@worldbank.org

J. Hoogeveen e-mail: jhoogeveen@worldbank.org

R. Katayama e-mail: rkatayama@worldbank.org

G. C. Yama e-mail: gyama@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_6

The latest bout of insecurity started in late 2012 with a Séléka insurrection in the north of the country. This led to three years of violence, destruction of property, and great human suffering, and left an estimated one-fifth of the population displaced. In May 2015, the Bangui Forum was organized to discuss the country's peace-building program, and to pave the way for elections. After another major outbreak of violence in September 2015, the country successfully held presidential and legislative elections in early 2016, which induced a lull in the conflict.

Despite the reduction in conflict, the country remained insecure even after the elections. More than a dozen armed militias remain active in the country today, controlling most of the country's territory. These armed groups are pursuing a wide spectrum of objectives. The Anti-Balaka, which arose from village-based self-defense groups, and the Union for Peace in the CAR (UPC), comprised mostly of Fulani cattle herders with the aim to protect transhumance corridors, have a strong focus on community protection. The Lord's Resistance Army (LRA), on the other hand, has no territorial or ethnic ties in the CAR, and uses the country as a safe haven and source of revenue through looting. The Popular Front for the Renaissance of the CAR (FPRC), by contrast, is active in the northern regions of the country and is closer to Chad. The United Nations peacekeeping mission (MINUSCA) operates among these armed groups. This mission, although unpopular, remains essential given the inoperative national defense and security forces, and the lack of state presence throughout the country.

To forge a national consensus on the country's needs and priorities for the first five years of the post-election period, in May 2016 the government of the CAR requested support from the European Union, the United Nations, and the World Bank Group to prepare a Recovery and Peacebuilding Assessment (RPBA). Those preparing the assessment were in urgent need of up-to-date information about the country, and requested data that could inform the planning of recovery activities and serve as a baseline for a monitoring system. The challenge was made greater by the fact that the new data and analytical results were needed by September 2016, leaving only three months to prepare and complete the data collection. Moreover, the rainy season was about to start and road infrastructure was in poor condition.

Household surveys take time to design and implement, and a typical welfare survey takes more than a year to prepare, field, and analyze. It was clear that a better-adapted solution would be needed. To complicate matters further, the conflict had left the country's statistical system, which had been reasonably developed prior to 2012, in poor shape. Many staff of the national statistical institute (ICASEES) had left, its offices had been pillaged, and much of the country's statistical memory had been wiped out. The existing sampling frame was outdated and no longer reliable given that entire villages had vanished, and 20–25% of the population was displaced.

# **2 The Innovation**

When considering the request for new statistical data, the team realized that given the precarious security situation, travel would need to be minimized. At the same time, disparities between Bangui and the country's periphery had been recognized as one of the drivers of the conflict, and thus collecting information nationwide was imperative. Donors also made it clear that poverty estimates should be updated, as insecurity and massive internal displacement had made the existing poverty estimates less relevant for decision making; as such, new poverty maps were needed that could be used to target interventions. It was evident, however, that it would be impossible to field and analyze a consumption survey within the given timeframe. Moreover, in the absence of a reliable sampling frame, such a survey would not constitute value for money.

The 2008 poverty numbers showed that even before the crisis, poverty in the CAR was pervasive. Poverty levels were estimated at 66% of the population, based on the international poverty line of US\$1.90 per day in 2011 purchasing-power parity terms. Since that time, the country's gross domestic product (GDP) per capita fell by one-third, and recent estimates suggest that the poverty rate surged to more than 76% in 2015. When almost everybody is poor, further refining the number of people living in poverty is of limited value, and means-based targeting is not a key priority. Instead, identifying what had to be targeted where was of greater importance.

Instead of producing a poverty map, the team decided to map the state of the nation by making a rapid assessment of the public services that were available. Drawing from the experience of Mali's Indice de Pauvreté Communale (District Poverty Index, IPC), a district census was designed for the CAR, called the *Enquête Nationale sur les Monographies Communales* (ENMC).<sup>1,2,3</sup> Enumerator teams would interview representatives and other district leaders from each of the country's 179 districts (the lowest administrative unit), using a structured questionnaire.<sup>4</sup> Since it was clear that in many locations officials were absent, and to avoid nonresponse because of this, the enumerator manual did not prescribe which officials had to answer, only that a group of officials who were knowledgeable about the district capital (*chef-lieu*) and the district's largest villages had to be identified. While this strategy was successful in that information was eventually obtained for every district, very detailed information from specialists could not be collected and questions had to remain relatively general.

The district census collected information on conditions in all districts across the country, including on local infrastructure, access to information (radio, television, and phone network), health and education facilities, local governance, economic activities, conflict, security, and violence, and local perspectives on security and policy priorities. On the basis that respondents would have more accurate information on their immediate environment, the questionnaire focused primarily on the situation in the district capital to improve the reliability of the data collected. In addition, district officials were asked to list the ten largest localities in their district outside the district capital, and to indicate the presence of schools, health facilities, water points, electricity, mobile phone networks, refugees and displaced people, transport opportunities, and markets in each of these localities.

<sup>1</sup>Observatoire du Développement Humain et Durable (ODHD), 2008. Profil de pauvreté des districts de Mali.

<sup>2</sup>While in this chapter the district census is emphasized, the ENMC also had a household survey component. More on this in Sect. 4.

<sup>3</sup>The instruments, data, and analysis of the *Enquête Nationale sur les Monographies Communales* (ENMC) can be downloaded from: http://bit.ly/2k7wFlq.

<sup>4</sup>The administrative divisions in the CAR are as follows: (1) prefecture, (2) sub-prefecture, and (3) district, referred to as *commune* locally. The eight administrative subdivisions of the capital city, Bangui, were treated as districts in the ENMC. The district census was carried out at the district level in the CAR.

A district census had several advantages. Districts are the smallest administrative divisions in the CAR, and are thus at the forefront of service provision. No sampling was required, as all 179 districts were to be covered. The small number of observations needed for this census had other advantages. Logistical complexity was reduced, and only a small number of enumerators had to be trained and supervised. Data collection and data entry were fast, and analysis and reporting were straightforward and visually appealing, as much of the information collected could be presented in the form of maps. Last but not least, the overall cost was small,<sup>5</sup> facilitating regular repeats of the ENMC and thereby ensuring that the RPBA's request to create the basis for a monitoring system could be fulfilled.

To facilitate decision-making, information collected in the district census was reflected in the Local Development Index (LDI). This composite index combines a range of policy-relevant indicators into a single measure. It thus sheds light on district conditions in a straightforward and easy-to-understand way. Moreover, by covering the entire country, the LDI could serve as an alternative to a poverty map, with the added advantage that all the tracked indicators are actionable by decision makers. This allows decision makers to identify which districts are in greatest need of additional investment. Decision makers can also use LDI scores as a basis for budget allocations, with underprivileged districts receiving larger per capita allocations, thus facilitating the process of decentralization.

The indicators used to construct the LDI fell into three categories: local administration, infrastructure, and access to basic services.

<sup>5</sup>It cost US\$180,000 to design, field, and analyze the ENMC. This covered the district census as well as the associated household survey.

Local administration was captured through indicators such as budget per capita (in local currency) allocated to the district, number of working staff at the local district government office, and presence of security forces (gendarmerie and police). The second pillar assessed the availability of basic infrastructure, including the presence of a mobile phone network and a banking system, and the transport cost per kilometer, as a proxy for mobility costs across the country. The third pillar measured the availability of basic services, such as public primary schools, health centers, sanitation systems, and clean water. These three pillars constitute the overall LDI. As there is no objective way for the different pillars to be weighted, and to keep the results tractable, each pillar was equally weighted in the final score: that is, the weight for each pillar was one-third. Within each pillar, some indicators were given a higher importance than others, and were therefore attributed different weights; however, each cluster of sub-activities, particularly health, education, and water, was assigned equal weight. Details of the weighting scheme are shown in Table 1.
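The two-level weighting described above (weighted indicators within a pillar, then an equal one-third weight per pillar) can be sketched as follows. This is a minimal illustration under stated assumptions: the indicator names and the within-pillar weights are hypothetical placeholders rather than the values in Table 1, and every indicator is assumed to have been rescaled beforehand to the range 0 to 1 with higher meaning better (so a cost indicator such as transport cost per kilometer would need to be inverted first).

```python
# Hypothetical within-pillar weights (relative, normalized inside the code).
# The actual indicators and weights are those of Table 1, not these.
PILLAR_WEIGHTS = {
    "administration": {"budget_per_capita": 2, "staff": 1, "security_forces": 1},
    "infrastructure": {"mobile_network": 1, "banking": 1, "transport_cost": 1},
    "services": {"education": 1, "health": 1, "water_sanitation": 1},
}


def pillar_score(indicators, weights):
    """Weighted average of the 0-1 indicators belonging to one pillar."""
    total_weight = sum(weights.values())
    return sum(weights[name] * indicators[name] for name in weights) / total_weight


def local_development_index(indicators):
    """Equal-weight average of the three pillar scores (one-third each)."""
    scores = [pillar_score(indicators, w) for w in PILLAR_WEIGHTS.values()]
    return sum(scores) / len(scores)


# Example district: decent services, weak administration and infrastructure.
district = {
    "budget_per_capita": 0.0, "staff": 0.5, "security_forces": 0.0,
    "mobile_network": 1.0, "banking": 0.0, "transport_cost": 0.5,
    "education": 0.5, "health": 0.5, "water_sanitation": 0.5,
}
print(round(local_development_index(district), 3))  # 0.375
```

Because the pillars are averaged with equal weight, a district cannot offset a complete absence of local administration with strong services alone: each pillar contributes at most one-third of the final score.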

# **3 Key Results**

The district census brought the characteristics of different areas that are critical for development planning into a single database. It presented information about the agro-ecological zone, the main and secondary sources of income, the main crops grown, and whether there were any mining activities in each district. The census collected information about the presence of displaced people and whether NGOs were active in an area. It also collected information on infrastructure, such as roads and electricity, and service delivery, such as schools and health centers. Finally, the perceptions of local officials were collected on their development priorities and how the current situation differed compared to six months earlier.

The district census confirmed the dismal state of development in the CAR, and demonstrated the considerable variation that exists across districts. District administration offices were found to be understaffed and short of funding. In most districts, security personnel (police and gendarmes) were absent, and only 24 districts had 20 or more staff in the municipal office, with regular payment of municipal staff remaining a problem. Moreover, 57 districts indicated not having received a budget allocation for 2016.

**Table 1** Local Development Index: components and weights

*Source* Authors' visualization

Access to infrastructure, including electricity, mobile phone coverage, banking services, and road networks, was found to be low. Only 15% of districts reported having electricity or some form of public lighting in the district capital, and only one of the 101 district capitals located in a rural area was found to be connected to the national electricity grid. Overall, only four in ten district capitals had at least one mobile phone provider. Furthermore, only one in ten district capitals had some form of banking system, either a bank or a local credit union. Half of the districts reported that roads to Bangui were not accessible throughout the year (Fig. 1).

*The district census elicited information about livelihoods, demonstrating how agriculture dominates the economy, which has low levels of economic diversification.*

*Interviews with local officials were used to elicit their development priorities.*

*District representatives' perceptions of changes in security and socio-economic conditions in the past six months.*

**Fig. 1** Selected results from the district census (*Source* Authors' calculations based on the CAR District Census/ENMC)

*Local administration: Funding and staffing in districts. There was low capacity in local governance, as districts lacked staff and funding. This was combined with the absence of gendarmerie and police forces in many districts.*

*Local infrastructure: Mobile phone coverage, banking services, and roads. Essential infrastructure (e.g. mobile phone coverage, banking services, roads) was lacking in many districts.*

*Access to basic services. Access to basic social services, such as primary schools and health centers, was found to be limited.*

**Fig. 2** Selected results on local administration, infrastructure, and access to services (*Source* Authors' calculations based on the CAR District Census/ENMC)

Access to basic social services such as public primary schools, health centers, and clean water was limited, especially outside district capitals. In the ten largest localities of each district, only 43% had a functional public primary school, 18% had a functional health center, and 43% had access to clean water sources. Access to clean water and sanitation systems was found to be limited even in the district capitals: only 36% of the districts reported having clean water access points in their capitals (Fig. 2).

The LDI was constructed using the approach described in Table 1, shedding light on current conditions in a simple and straightforward way. The LDI score was low for most districts, indicating the need for substantial improvements across the country. Among the three pillars that form the LDI, local infrastructure varied more across districts, whereas access to basic services was relatively homogeneous. Compared to other districts in the country, those in Region 1, Region 2, and Region 7, which correspond to the capital and southwestern region of the country, were more likely to be in the top quintile of the LDI.

# **4 Implementation Challenges, Lessons Learned, and Next Steps**

The ENMC demonstrated the feasibility of collecting nationwide information relevant to decision makers, both rapidly and in a cost-effective manner. The data informed project preparation and fed the RPBA monitoring system. Results have been widely disseminated, and representatives in each district have received posters showing how they perform relative to other districts in the country (Fig. 3). The district census will be repeated annually to track progress.

*The LDI score is low for a large share of districts, but districts located in the south-west regions have relatively higher LDI scores.*

**Fig. 3** Local Development Index across districts (*Source* Authors' calculations based on the CAR District Census/ENMC)

Because the main cost of most surveys is the transport cost for enumerators to physically reach the survey locations, the district census was supplemented by a light household survey at a marginal cost. The survey was considered 'light' in the sense that no detailed consumption data were collected. Sampling for the household survey took account of the fact that traveling throughout the country was still dangerous, and time was limited for data collection. Given these concerns, in addition to high transport costs, an unorthodox sampling design was selected in which ten households were interviewed in each district: five households were randomly selected from a randomly selected neighborhood of the *chef-lieu*, and five households were randomly selected from a randomly selected village located 20–40 kilometers from the *chef-lieu*. In each of the selected localities, a simple listing of households was completed, up to a maximum of 100 households, from which the five households were selected.
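The sampling design described above can be sketched in a few lines. This is an illustrative reading of the text, not the survey team's actual procedure: the input structures (a mapping of neighborhoods to household listings, and of villages to a distance and a listing) and all function names are assumptions made for the example.

```python
import random


def select_households(listing, n=5, max_listing=100, rng=None):
    """Pick n households at random from a listing capped at max_listing."""
    rng = rng or random.Random()
    capped = listing[:max_listing]  # the listing stops at 100 households
    return rng.sample(capped, min(n, len(capped)))


def sample_district(neighborhoods, villages, rng=None):
    """Draw the ten-household sample for one district.

    neighborhoods: {name: [household ids]} for the chef-lieu
    villages: {name: (distance_km_from_chef_lieu, [household ids])}
    """
    rng = rng or random.Random()

    # Stage 1a: one random neighborhood of the chef-lieu.
    neighborhood = rng.choice(sorted(neighborhoods))

    # Stage 1b: one random village located 20-40 km from the chef-lieu.
    eligible = sorted(name for name, (km, _) in villages.items() if 20 <= km <= 40)
    if not eligible:
        raise ValueError("no village within 20-40 km of the chef-lieu")
    village = rng.choice(eligible)

    # Stage 2: five households from each selected locality.
    return {
        "chef_lieu": (neighborhood, select_households(neighborhoods[neighborhood], rng=rng)),
        "village": (village, select_households(villages[village][1], rng=rng)),
    }
```

Passing a seeded generator, for example `random.Random(2016)`, makes the draw reproducible, which helps when a selection has to be documented or audited later.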

The survey was designed such that a team of two enumerators and a driver could collect all the information from one district within two days, allowing for speed of data collection, and reducing costs and exposure to risk. This strategy was successful. District officials from all 179 districts were interviewed, and in the end, households in only two districts could not be interviewed because the situation was too dangerous. Officials from these two districts were interviewed in neighboring locations. A total of 1767 households were interviewed.

The household survey served as a valuable complement to the district census. It allowed differences in perceptions and priorities for development between citizens and their representatives to be investigated, and the results show that these differences were minimal. Repeating both the household survey and the district census will aid in understanding whether improvements in service delivery as reported by district representatives match improvements in outcomes, such as education and health, reported by households.

*Those with poor food consumption tend to be less wealthy and located in the two northern agro-ecological zones, which overlap with the Fertit, Yada, and Plateaux regions.*

**Fig. 4** Food consumption by wealth and agro-ecological zone (*Source* Authors' calculations based on the CAR District Census/ENMC)

The household survey further allowed for the collection of information about wealth, displacement, the experience of shocks, the impact of the crisis, and food security. Using a concept borrowed from the World Food Programme, the Food Consumption Score (FCS) was calculated using information about the frequency with which nine different types of food had been consumed by the household in the past seven days.<sup>6</sup> The FCS was then used to explore which households found themselves in one of three categories: severely food insecure (poor), moderately food insecure (borderline), or food secure (acceptable) (Fig. 4).
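To make the FCS calculation concrete, the sketch below uses the standard weights and the 21/35 cut-offs from the World Food Programme (2008) guidance. Whether the ENMC used exactly these weights and thresholds is an assumption, since the chapter does not list them; the group labels and function names are illustrative.

```python
# Standard WFP (2008) food-group weights; assumed here, not stated in the chapter.
FCS_WEIGHTS = {
    "staples": 2.0,
    "pulses": 3.0,
    "vegetables": 1.0,
    "fruit": 1.0,
    "meat_fish": 4.0,
    "milk": 4.0,
    "sugar": 0.5,
    "oil": 0.5,
    "condiments": 0.0,
}


def food_consumption_score(days_consumed):
    """Weighted sum of the number of days (0-7) each food group was eaten."""
    score = 0.0
    for group, weight in FCS_WEIGHTS.items():
        days = min(max(days_consumed.get(group, 0), 0), 7)  # clamp to a 7-day recall
        score += weight * days
    return score


def fcs_category(score, poor_max=21.0, borderline_max=35.0):
    """Map a score to the three categories used in the text."""
    if score <= poor_max:
        return "poor"
    if score <= borderline_max:
        return "borderline"
    return "acceptable"
```

A household that ate staples every day, vegetables on three days, and oil on four days would score 2×7 + 1×3 + 0.5×4 = 19 and fall into the poor category under these thresholds.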

In light of the situation, the ENMC was a success: the three-month deadline was met and a valuable dataset was generated, which informed and continues to inform decision makers. In an FCV context where state presence is limited and/or contested, the mere fact of collecting data nationwide contributed to a sense of equal treatment among districts and a feeling of belonging to one nation state.

<sup>6</sup>World Food Programme (2008), *Food Consumption Analysis: Calculation and Use of the Food Consumption Score in Food Security Analysis*.

This was important. The census was among the very few public initiatives which were successful in covering the entire territory and to which the Government could point as evidence of its commitment to all citizens across the nation's territory. A sample survey would not have had this intangible benefit.

With the benefit of hindsight, some aspects of the process could have been improved. More time could have been spent on developing the district census questionnaire, thus avoiding the need to change the contents of the questionnaire in its second wave (fielded in 2018) when the authorities were warming up to the idea of an LDI. On the other hand, once the initial LDI was constructed, it proved to be much easier to convince officials to substantively contribute to discussions about what it should entail.

While the team remains generally satisfied with the data collected by the household survey, it would have been advantageous if more households could have been interviewed in some areas. Bangui, the capital city, comprises eight administrative subdivisions (arrondissements), and the 78 households that were interviewed in Bangui were too few to support detailed reporting for the capital city. This also holds for some of the northeastern prefectures, which comprise very few districts; consequently, too few observations were collected to support more disaggregated reporting. In addition, while the survey collected information from displaced people who were residing with extended families, camps for Internally Displaced People (IDP) were not covered by the survey.

Most importantly, the experience of ICASEES, the national statistical office, in fielding a survey in hard-to-reach and insecure areas was invaluable. Enumerators were given vests, and their cars were fitted with flags that demonstrated clearly that they worked for ICASEES, giving them some degree of protection from armed groups. Furthermore, enumerators assigned to at-risk areas were paid slightly more to motivate them to go and to avoid adverse selection in which the least experienced enumerators go to the most difficult areas. Trips were carefully planned, taking into account the type of infrastructure available and the appropriate means of transport. Where needed, motorbikes or boats were used.

Teams traveling into areas considered highly insecure were in regular contact with their team leader in Bangui. Although overall mobile phone coverage was limited to urban centers, it allowed teams to be followed closely as they moved from one location to the next. In some cases, teams borrowed radios from the UN or NGOs to contact supervisors. In addition, prior to the deployment, teams were trained to contact armed group leaders before entering areas controlled by them and to inform them about the data collection activity. Once in the area, the teams would pay a visit to these armed group leaders to seek their authorization in the form of a laisser-passer letter or stamp indicating their support for the activity. This allowed the teams to work in relative security, and in some cases the teams were escorted by the armed group in return for a small token of appreciation.<sup>7</sup>

Paper questionnaires were used, as tablets or smartphones were deemed too attractive to armed groups. UN flights were used to access hard-to-reach areas, where the teams often had to hire transport from local strongmen, giving them implicit protection. Teams received pocket money to be used at roadblocks to ensure safe passage. These measures turned out to be effective. Not only were all data collected in less than four weeks, but all teams returned to Bangui safe and unharmed.

#### **Box 1 LDI in Mali allows for comparisons across time and space**

For over a decade, Mali has conducted commune censuses which are similar to the Central African Republic (CAR) district census. While the CAR district census data are summarized in a Local Development Index (LDI), Mali's four commune censuses are used to compute the Indice de Pauvrete Communale (Commune Poverty Index, IPC) and poverty quintiles which are subsequently used in budget allocation formulas. The IPC is based on a principal component analysis (PCA) which is redone for every census, making the IPC noncomparable from one census to another.

<sup>7</sup>Clearly the presence of an escort by an armed group may have influenced responses to certain questions. Teams had been instructed beforehand to make sure that if they had an escort, armed group representatives would not be present at the interview.

Taking advantage of the CAR experience, an LDI has been developed for Mali which allows comparisons of commune development across space and time. As in CAR, the LDI focuses on three aspects of development: local administration capacity, presence of infrastructure, and service delivery. These aspects have an equal weight of 33% and their sub-indicators are also equally weighted. The sub-indicators are common across all available commune censuses.
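The equal-weighting scheme can be illustrated with a short sketch. It assumes that each sub-indicator has already been scaled to the interval [0, 1]; that scaling, the sub-indicator names, and the function name are hypothetical, as the box does not spell out how individual items are coded.

```python
from statistics import mean


def local_development_index(pillars):
    """Equal-weighted index: mean of pillar scores, each the mean of its sub-indicators.

    pillars: dict mapping a pillar name to a dict of sub-indicator values in [0, 1].
    Returns (index, pillar_scores).
    """
    pillar_scores = {name: mean(subs.values()) for name, subs in pillars.items()}
    return mean(pillar_scores.values()), pillar_scores


# Hypothetical commune with three pillars, as in the text.
commune = {
    "administration": {"has_mayor_office": 1.0, "has_civil_registry": 0.0},
    "infrastructure": {"road": 1.0, "electricity": 1.0, "market": 0.0, "mobile": 0.0},
    "services": {"school_access": 0.2},
}
index, by_pillar = local_development_index(commune)
```

Because every pillar carries a weight of one third regardless of how many sub-indicators it contains, adding an item to one pillar does not change the relative importance of the other two.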

The LDI's definitions remain unchanged from one census to another and are comparable over time. The LDIs are positively correlated with the IPCs, and negatively with local poverty estimates. Because the LDI and IPC indices are ordinal, meaning that a lower value is associated with being poorer (IPC) or less developed (LDI), the (monotonic) relationship between them can be assessed using Spearman's correlation coefficient.<sup>8</sup> For the first three censuses for which both indices are available (2006, 2008, and 2013) the correlation coefficient lies above 0.65 with a *p*-value close to zero, suggesting a strong and statistically significant positive relationship. There is also a negative relationship between the LDI and the individual poverty rate (headcount ratio) of communes. The availability of a poverty map for Mali for 2009 made it possible to assess this relationship. Communes were grouped in two poverty categories depending on whether their poverty incidence was higher (first group) or lower (second group) than the national one. The LDI for poorer communes is significantly lower. Moreover, the LDI of the poor communes is lower than the national LDI average, which in turn is lower than the average LDI of the second group.
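For readers unfamiliar with Spearman's coefficient, the sketch below computes it as the Pearson correlation of the ranks, with tied values receiving their average rank. It is a plain-Python illustration with hypothetical function names, not the routine used for the Mali analysis, and it does not compute the *p*-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because only ranks enter the calculation, any strictly increasing transformation of either index leaves the coefficient unchanged, which is why the measure suits ordinal indices such as the LDI and the IPC.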

The new LDI is a useful tool for the analysis of development trends in Mali. For instance, looking at the regional LDI evolution between 2006 and 2017, Fig. 5 indicates that the communes in the region of Mopti and Segou had the highest increases in LDI (+76 and +61% respectively), while communes in the region of Kidal and Bamako had the lowest (+4 and +1% respectively). The big difference between Kidal and Bamako is that Bamako started at a very high base level, whereas Kidal started from a very low level. The LDIs show that before the crisis, the three northern regions were among the least developed in the country—lending support to grievances by the northern population about neglect by the central government. Broken down by livelihood zone, one notes that the progress in LDIs is strongly associated with crop production, and much less with nomadism and pastoralism.

The new index provides insight into the development dynamics of communes in the country. Figure 6, for instance, shows the scatter plot of the LDI 2006 and LDI 2017 by region. It shows how the LDI for most communes improved (the dots lie above the 45-degree line), with the exception of Kidal and Tombouctou where a substantial fraction lies below the 45-degree line. The map demonstrates that almost all the worst performing communes can be found in the northern part of the country, and particularly in the North-East.

<sup>8</sup>A positive (negative) monotonic relationship between two variables is one in which, as the value of one variable increases, the value of the other variable increases (decreases).


# **Part II**

# **Methodological Innovations**

# **7**

# **Methods of Geo-Spatial Sampling**

**Stephanie Eckman and Kristen Himelein**

# **1 Introduction**

Technological advances in geospatial data have the potential to change how survey data are collected. Long hampered by high costs, limited capacity, and difficulties in supervision, sample selection is often done using second-best or nonprobability approaches. As geospatial technology has improved and become more widespread, costs have come down and the number of available tools has increased, making Geographic Information Systems (GIS)-based sampling approaches accessible to more users. This chapter presents experiences with GIS-based sampling from three different settings: (i) where no sampling frame is present because the census is outdated; (ii) sampling pastoralist communities; and (iii) rapid listing of enumeration areas to reduce exposure of field teams. The case studies focus on extreme situations, particularly those in conflict-prone areas, as innovation often takes place when few other options are available. The applications discussed here, however, are applicable to many less extreme situations.

K. Himelein (\*) World Bank, Washington, DC, USA e-mail: khimelein@worldbank.org

S. Eckman RTI International, Washington, DC, USA

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_7

# **2 Data Challenge and Innovation #1: Creating a Sampling Frame in the Absence of a Census**

For many studies, no sampling frame of the target population is available. The most common approach to addressing this problem for large-scale household surveys in the developing world is to use a stratified two-stage design. In the first stage, census enumeration areas are selected as the Primary Sampling Unit (PSU), using probability proportional to estimated size. In the second stage, a household listing operation is conducted in the selected PSUs, and households are selected using simple random sampling.<sup>1,2</sup> With this approach, even outdated census data can be used to select PSUs, as long as a high-quality listing operation is done in the selected PSUs to create a sampling frame for the second stage selection of households. Using out-of-date census data as a measure of size in PSU selection will result in estimates that are inefficient but still unbiased. However, some countries do not have census records at all because of accessibility issues, war, or natural disasters. In these situations, newly available high-resolution satellite data can be used to generate estimated population densities and to demarcate PSU boundaries. The two examples discussed here are from surveys conducted in rural Somalia and Kinshasa, Democratic Republic of the Congo (DRC).
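The two-stage logic can be sketched as follows. The first function draws PSUs with probability proportional to size using systematic selection, which is one common way of implementing such a draw and is an assumption here, since the chapter does not name the selection routine. The second function shows why an outdated size measure does not bias estimates: the design weight combines the first-stage probability with the household count from the fresh listing. All names are illustrative.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n PSU indices with probability proportional to size (systematic PPS).

    Assumes no single size exceeds total / n, so no PSU can be hit twice.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step          # random start in [0, step)
    selected, cumulative, idx = [], 0.0, 0
    for k in range(n):
        target = start + k * step
        # Advance to the PSU whose cumulative size interval contains the target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def household_weight(size, total, n_psus, listed, n_households):
    """Design weight: inverse of (PSU inclusion probability x within-PSU probability).

    size / total : (possibly outdated) measure of size of the PSU and of the frame
    listed       : households found in the fresh listing of the selected PSU
    n_households : households sampled from that listing
    """
    psu_probability = n_psus * size / total
    household_probability = n_households / listed
    return 1.0 / (psu_probability * household_probability)
```

If the size measure is badly out of date, the weights vary widely across households and the variance of estimates rises, but the weights remain correct because they reflect the probabilities actually used in selection.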

<sup>1</sup>For the purposes of this chapter, the word "dwelling" is used to denote a physical structure inhabited by one or more households, while a "household" is a group of individuals that function as an economic unit. All methods that select dwellings which have the possibility of containing multiple households have selection protocols to randomly select an individual household for interview.

<sup>2</sup>Grosh and Muñoz (1996).

In Somalia, the last population census, carried out in 1975, measured the population at 3.9 million. Current estimates for the country indicate a population of more than 14 million. For the DRC, similarly, the last census was carried out in 1984, at which time the population was around 29 million. Current population estimates are now over 77 million. As noted above, it would still be possible to use the outdated census for estimated population totals if there was an expectation of approximately constant growth across regions. Both Somalia and DRC, however, have experienced significant civil strife, including large-scale displacements of the population.

Some countries, notably Haiti following the 2010 earthquake, have used "quick counts" to collect information about where the population lives and to estimate its size. In a quick count, enumeration areas are randomly sampled and listed, then the results are used to build a model to update census counts in the remaining areas.<sup>3</sup> However, in Haiti, the most recent census was only seven years old at the time of the earthquake, and the damage and population movements were relatively concentrated. The more time that has elapsed since the last census, the more difficult it is to develop an accurate model of the current population based on quick counts. Moreover, the DRC has a land area nearly 85 times the size of Haiti, which makes using a quick count methodology impractical from the perspective of both cost and implementation time. In Somalia, in addition to ongoing insecurity in certain areas, the enumeration area estimates from the 1975 census were never published, and the full results are thought to be lost. Therefore, alternative approaches to selecting a household sample were needed in both Somalia and the DRC.

### **2.1 Innovations**

Three approaches were implemented across the two surveys. For the Somali High Frequency Survey (SHFS), rural areas posed a challenge for the creation of a sampling frame. Rural areas were defined as non-urban, permanently settled areas, excluding Internally Displaced Persons (IDP) settlements.<sup>4</sup> To create a frame for the first selection stage, a gridded population approach was developed in collaboration with Flowminder.<sup>5</sup> Rural areas that were secure enough for data collection were divided into 100 by 100 meter grid cells. For each cell, WorldPop data provided an estimated population size.

<sup>3</sup>IHSI et al. (2012).

Neighboring cells were then combined to form PSUs, using a quadtree algorithm, which combines cells to meet specified criteria, in this case, area and population size.<sup>6</sup> The maximum area was set at 3 by 3 kilometers, and the maximum population was limited to 3500 to keep enumeration areas manageable for field teams. The left panel of Fig. 1 shows the PSUs created by the above steps, with the color indicating the estimated population in each one.<sup>7</sup> Next, a sample of PSUs was selected using probability proportional to estimated size. The selected PSUs were then further subdivided into segments. If the selected PSU contained 12 or fewer dwellings based on satellite imagery, only one segment was defined. For those PSUs containing between 13 and 150 dwellings, 12 segments were defined, with additional segments being defined for PSUs with more than 150 dwellings.
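A minimal sketch of a quadtree partition is given below. It uses the common top-down formulation, in which a square block of grid cells is split into four quadrants until each block satisfies the population and side-length limits; this produces the same kind of partition as combining cells bottom-up. The function name, the requirement that the grid side be a power of two, and the defaults (3500 people, 30 cells of 100 meters, that is, 3 kilometers) are assumptions made for illustration and are not taken from the SHFS implementation.

```python
def quadtree_psus(pop, row, col, size, max_pop=3500, max_side=30):
    """Partition a square block of 100 m cells into PSUs by recursive splitting.

    pop      : square grid (list of lists) of estimated population per cell
    row, col : top-left corner of the current block
    size     : side length of the block in cells (assumed to be a power of two)
    Returns a list of (row, col, size, population) tuples, one per PSU.
    """
    total = sum(pop[r][c]
                for r in range(row, row + size)
                for c in range(col, col + size))
    # Stop when the block meets both limits, or cannot be split any further.
    if size == 1 or (total <= max_pop and size <= max_side):
        return [(row, col, size, total)]
    half = size // 2
    blocks = []
    for dr in (0, half):
        for dc in (0, half):
            blocks += quadtree_psus(pop, row + dr, col + dc, half, max_pop, max_side)
    return blocks
```

Dense areas end up covered by many small blocks and sparse areas by a few large ones, so PSU populations are far more even than those of a regular grid. A single cell that exceeds the population limit is returned as is, since it cannot be subdivided in this sketch.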

A major disadvantage of the grid approach described above is that the boundaries of the resulting PSUs do not follow natural boundaries such as roads, valleys, and rivers. The cells' artificial boundaries complicate field implementation. Aware of this constraint, the team initially pursued an alternative methodology in which the WorldPop distribution was used to randomly select points to serve as "seeds" for PSUs, which were then grown until they reached an estimated population of around 150 dwellings but without crossing natural boundaries.<sup>8</sup> Unfortunately, two major drawbacks became immediately apparent: the development of algorithms to detect natural boundaries was expensive and time-consuming, and selection probabilities were not straightforward to calculate because of boundary effects (seeds near boundaries could grow in fewer directions than others). The team therefore reverted to a gridded approach but manually adjusted segments to follow natural boundaries to mitigate potential implementation issues.

<sup>4</sup>In urban areas, boundaries and population estimates were available from the United Nations Population Fund's Population Estimation Survey. Boundaries of IDP settlements were provided by the United Nations High Commissioner for Refugees' Shelter Cluster.

<sup>5</sup>Closely following the methodology by Muñoz and Langeraar (2013).

<sup>6</sup>See Samet (1984), for a description of the methodology, and Minasny et al. (2007), for an application of the methodology to sample design.

<sup>7</sup>The map shows both urban and rural areas. Urban areas were not subject to the same population or land area limits.

<sup>8</sup>Thomson et al. (2017).

*Left panel: Population for construction of PSUs in Somalia. Right panel: Building classification in the city of Kindu, DRC.*

**Fig. 1** Building classifications (Color figure online) (*Source* Authors' calculation)

In the DRC, two methods were used. In the districts of Kisenso, Kimbanseke, and Mont Ngafula in Kinshasa, and the sites of Kindu, Tchonka, and Basankusu, a one-stage sample of dwellings was selected based on counts of dwellings made from satellite images. In partnership with the firm Satplan Alpha, the project used recent satellite images to count and geo-locate all dwelling units. This work was done manually. Team members classified each building in the satellite images as low-density residential, high-density residential, or non-residential, using their local knowledge of the typical characteristics of dwelling units in the DRC. These typical characteristics were locally specific, varying between cities and between dense inner-city districts, peri-urban zones, and semirural areas on the outskirts. The main characteristics used to classify structures were architecture, building size and features, roof segmentation, roof design intricacy and height, building orientation, site boundary features, proximity to major streets, street activity, and traffic. The right panel in Fig. 1 shows the final map for Kindu, DRC, with each building classified as low-density residential (blue), high-density residential (yellow), or non-residential (red). When the counting, geo-locating, and classification were complete, each dwelling was assigned a random number, and a sample was selected through a one-stage random draw. If the classification was correct, this approach resulted in an equal-probability simple random sample of dwellings.
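The one-stage draw can be illustrated as follows. The record layout (a dictionary with a `class` field) and the function name are hypothetical, and returning the remaining dwellings in random order as a reserve list is an assumption added for illustration rather than a documented part of the DRC procedure.

```python
import random


def one_stage_sample(buildings, n, rng):
    """Assign each residential building a random number and keep the n smallest.

    buildings: list of dicts with at least a "class" key, as produced by the
    manual classification (low-density residential, high-density residential,
    non-residential).
    Returns (sample, reserve); the reserve keeps the random order so that
    replacements can be taken from the top without a new draw.
    """
    residential = [b for b in buildings if b["class"] != "non-residential"]
    keyed = [(rng.random(), b) for b in residential]
    keyed.sort(key=lambda pair: pair[0])
    ordered = [b for _, b in keyed]
    return ordered[:n], ordered[n:]
```

Sorting on independent uniform random numbers gives every residential building the same chance of falling among the first n, which is what makes the draw an equal-probability simple random sample when the classification is correct.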

In the districts of N'djili and Makala in Kinshasa, a two-stage random sample was used.<sup>9</sup> PSU boundaries were first defined using administrative and physical boundaries such as rivers, highways, and secondary and residential roads that would be easily identifiable by interviewers on the ground. The delineation process used an automated iterative approach where PSUs were created and then split or merged based on target population size. The left panel of Fig. 2 shows a map indicating the manually created PSUs.

The next step was to estimate the population within each of these PSUs from high-resolution satellite data. First, a Random Forest Regression model was used to estimate population density based on contextual image information (image metrics that incorporate various aspects of surrounding information, rather than single-pixel signature).<sup>10</sup> The model was trained using a sub-sample of building locations.<sup>11</sup> The area and average building density for each PSU was then integrated with land use and land-cover data to adjust the area by the percentage covered with vegetation and then to produce a building count.<sup>12</sup>

<sup>9</sup>For further information, see Hirn and Rodella (2017).

<sup>10</sup>Implemented using MapPy, a Python library for remote sensing developed by Jordan Graesser.

<sup>11</sup>Graesser et al. (2012) provide a more detailed description of contextual image information in image processing.

<sup>12</sup>Building counts derived in this way produce comparable results to manual rooftop counts.

*Left panel: PSUs defined by natural and administrative boundaries. Right panel: Algorithmic estimate of population density.*

**Fig. 2** Boundaries and population densities (*Source* Authors' calculation)

PSUs were selected with probability proportional to this estimated size. A full listing operation was then conducted in the selected PSUs prior to the second stage selection of households. This approach yields estimates with larger variances, and hence less precision, than the single-stage approach because the resulting sample is clustered.<sup>13</sup>

#### **2.2 Key Results and Implementation Challenges**

Each of the methods described above produced a sampling frame from which a representative sample was selected. There were, however, substantial challenges in Somalia. For the SHFS, 407 PSUs were selected for the survey (320 urban and 87 rural), and 366 PSUs were selected as replacements (251 urban and 115 rural). After selection, the PSUs were overlaid with satellite imagery from Google Earth and Bing to verify the presence of dwellings. Following that process, 53% of rural PSUs and 2% of urban PSUs were discarded and replaced due to having no visible population. In some cases, it was necessary to replace a PSU multiple times before one with visible dwellings was identified.

<sup>13</sup>Eckman, S., and B. West (2016), "Analysis of Data from Stratified and Clustered Surveys," in Wolf, C., Joye, D., Smith, T., and Fu, Y. (Eds.), *Handbook of Survey Methodology*. Thousand Oaks: Sage, 477–487.

The approach used in the DRC generated more reliable results. Both the single-stage and multi-stage methods yielded results close to what the interview teams found during the listing exercise. The single-stage approach, which manually located dwellings based on satellite imagery and then drew a one-stage random sample, was applied in three large districts of Kinshasa. Locating individual dwellings on satellite imagery remains a manual task that is relatively time-consuming and cannot be entirely standardized. While guidelines can be set for identifying dwellings, in practice, judgment calls are often required to (for example) distinguish businesses or separate conjoined structures into multiple dwellings. When selected structures turned out to be businesses, empty or destroyed houses, or other non-dwelling structures, the misidentified structures were replaced by a randomly selected replacement dwelling. If such misidentification is not excessive and does not systematically vary across the sampled area, the sample can be assumed to remain unbiased. However, misidentification can increase costs and needs to be monitored closely. Systematic variation in the misidentification of households across the sampled area may bias the sample (for example, underrepresenting areas with many high-rise buildings if the true number of dwellings within high-rises is systematically under-identified in a rooftop count). From a practical point of view, interviewers also sometimes struggled to find the selected households in dense areas, because no addresses were available, only a rooftop view with a GPS point. This drawback can be mitigated, however, by equipping interviewers with GPS-capable phones and clear walking maps that point out local landmarks and house characteristics to help with identification.

The second approach used in the DRC, which first defined PSUs and then algorithmically estimated population numbers to allow for an unbiased two-stage selection, posed different challenges. First, refining the algorithm that estimates population density is technically more complex than a simple visual count of dwellings based on satellite imagery. Once in place, however, it can quickly create automated population estimates for large areas. A second challenge is the loss of statistical efficiency inherent in the two-stage approach. Third, interviewers carrying out the listing within selected PSUs sometimes struggled to follow PSU boundaries and to distinguish which buildings were within or outside a given PSU. To minimize such problems, it is critical to prepare clear walking maps for interviewers and guidelines on how to deal with overlapping properties.

In the 28 PSUs in the Makala municipality in the Funa district of Kinshasa, both manual counting of residential buildings (the first method) and the modeling approach (the second method) were used, permitting a comparison between the two methods and the actual number of households identified in the field listing. Compared to an actual total of 9322 households recorded by the listing, the manual approach identified 7489 dwellings, while the modeling approach generated 10,667 dwellings in the same area. The correlations between the estimated and the actual values at the PSU level were 88.7% and 93.1% for the manual approach and the modeled approach, respectively. This important result indicates that the algorithm outperformed manual counting, at least for this application. See Fig. 3 for a comparison of the dwelling counts estimated by the two methods with the household totals generated in the listing operation, for the 28 PSUs in Makala.

**Fig. 3** Listing totals, modeled estimates, and rooftop counts for Makala (*Source* Authors' calculations)
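The aggregate totals reported for Makala can be turned into relative errors with a few lines of arithmetic. The figures are those given in the text; the function name is illustrative, and the comparison treats dwellings and households as roughly interchangeable units, which is an assumption.

```python
def relative_error(estimate, actual):
    """Signed relative error of an estimate against the field listing."""
    return (estimate - actual) / actual


LISTED_HOUSEHOLDS = 9322    # field listing, 28 PSUs in Makala
MANUAL_COUNT = 7489         # manual rooftop count
MODELED_COUNT = 10667       # Random Forest based estimate

manual_error = relative_error(MANUAL_COUNT, LISTED_HOUSEHOLDS)    # about -0.20
modeled_error = relative_error(MODELED_COUNT, LISTED_HOUSEHOLDS)  # about +0.14
```

On these totals the manual count falls short of the listing by roughly 20%, while the model overshoots by roughly 14%. For sample selection with probability proportional to size, however, what matters most is how well the estimates track the true sizes across PSUs, which is why the PSU-level correlations are the more relevant comparison.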

# **3 Data Challenge and Innovation #2: Sampling Pastoralist Communities**

Livestock ownership serves a diverse set of functions in the developing world, from food source to savings and use as an investment vehicle. The pastoralist sector, however, has recently come under increasing pressures from several sources, including an increased demand for meat and dairy products from expanding middle classes, climate change, and the loss of traditional pasture land to development. Those who are among the most vulnerable to these pressures are nomadic and semi-nomadic pastoralist populations, but the transitory nature of their living situation also hampers the collection of high-quality representative data on which to base analyses.

Because many pastoralists lack a permanent dwelling, they are excluded by a traditional two-stage sampling approach. In July and August 2012, the World Bank undertook a survey in the Afar region of Ethiopia to test a novel approach to sampling the general population, including pastoralists.

The Afar region was selected for the pilot project for several reasons. First, the World Bank has an ongoing relationship with the Ethiopia Central Statistics Agency (CSA), including supporting the implementation of the Ethiopia Rural Socioeconomic Survey, which includes a module on pastoralist issues. Second, the CSA has a high-quality existing GIS infrastructure and a relatively high level of training compared to other potential study areas. Third, the Afar region offered geographic advantages over other pastoralist areas. It covers a land area of approximately 72,000 square kilometers in the north-east of Ethiopia and is relatively isolated. Well-guarded national boundaries, geographic features, and traditional ethnic hostilities limit the migration of the Afar people outside the boundaries of the region.

### **3.1 Innovations**

The approach used to ensure pastoralist populations were included was the Random Geographic Cluster Sampling (RGCS) method. In an RGCS design, points (latitude and longitude) are randomly selected, and then a circular cluster of a given radius is created around the central point. All eligible respondents found within this cluster are selected for the survey. The main advantage of this design is that it captures everyone who is inside the selected circle at the time of the survey, including those who do not have a permanent dwelling or who are temporarily away from their usual dwelling. Properly implemented, this design eliminates the underrepresentation of mobile populations. Similar methods are commonly used in both developed and developing world contexts to measure agricultural production and livestock.<sup>14</sup>
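The geometric core of an RGCS design can be sketched as follows: draw a random point, then test whether a GPS location lies within the circle around it. The bounding-box draw, the spherical-Earth haversine distance, and all names are assumptions for illustration; in practice points would be drawn within stratum polygons using GIS software.

```python
import math
import random

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def random_point(lat_min, lat_max, lon_min, lon_max, rng):
    """Draw a point uniformly by area within a latitude/longitude bounding box."""
    lo, hi = math.sin(math.radians(lat_min)), math.sin(math.radians(lat_max))
    lat = math.degrees(math.asin(lo + rng.random() * (hi - lo)))  # equal-area in latitude
    lon = lon_min + rng.random() * (lon_max - lon_min)
    return lat, lon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def in_circle(center_lat, center_lon, lat, lon, radius_km):
    """True if (lat, lon) lies within radius_km of the circle center."""
    return haversine_km(center_lat, center_lon, lat, lon) <= radius_km
```

The same membership test can be applied after fieldwork to the recorded GPS coordinates of interviewed households, so that any household found to lie outside its circle can be excluded from the analysis.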

To increase efficiency and lower fieldwork costs, the Afar region was divided into five strata, defined by the expected likelihood of finding herders and livestock. Spatial datasets describing land cover, land use, and other geographical features were used as input to delineate five discrete, mutually exclusive strata. The first stratum consisted of land in or near towns; the second stratum consisted of land under permanent agriculture; the third stratum was considered to be the most likely to contain livestock, and consisted of land within two kilometers of a major water source, including the Awash River and its permanent tributaries, and which also met the criteria for pasture based on a vegetation index; the fourth stratum consisted of land between two and ten kilometers from a major water source which also met criteria of pasture land; and the fifth stratum consisted of the remainder of the land area, which was considered to have the lowest probability of finding livestock (Fig. 4).

A total of 125 points were selected from these five strata for the survey. The number of selected points was higher in the strata with the highest expected concentrations of potentially nomadic households and livestock (Stratum 3) and lower in areas of lower expected density (Stratum 5). The radii for the circles also varied across the strata. In areas with higher expected densities, smaller circles were used to keep the workload manageable. In areas where few or no livestock were expected, the circle radius was expanded to the largest feasible dimensions to maximize the probability of finding animals. Table 1 lists the definition, sample size, and radius used in each of the five strata.

<sup>14</sup>For a more complete list of previous applications, see Himelein et al. (2014).

**Fig. 4** Stratification map


**Table 1** Stratification of the Afar region

After the selection of the PSUs, teams were given maps and handheld GPS devices to conduct the surveys. Upon arriving at the center of a circle, the team canvassed the circle and interviewed all households within its boundaries. The GPS device showed the selected circle and alerted interviewers when they crossed into or out of the area.

### **3.2 Key Results**

The pilot project of the RGCS technique to collect livestock data in the Afar region of Ethiopia demonstrated that the implementation of such a design is feasible. Of the 125 points selected, 102 were visited. Of those visited, 59 circles (58%) contained at least one livestock animal. In total, the interviewers collected information from 793 households that owned livestock, although nine of these households were shown by their GPS coordinates to be outside of the circle boundaries and were therefore excluded from the analysis, leaving a total sample size of 784. The number of interviewed livestock-owning households per circle ranged from one to 65, with a mean of approximately 15. In total, 3698 individuals living in households owning livestock were identified as part of the survey. Of these, 127 reported having no permanent dwelling, which corresponds to a weighted estimate of 4701, or 2% of the livestock-holding population in the study area. All but five of the individuals without a permanent dwelling lived in households in which all members were completely nomadic. The inclusion of households without permanent addresses in the survey was a primary objective of the original research agenda because this group is traditionally underrepresented in dwelling-based surveys.

Overall, the project showed that sufficient GIS information is available, often in the public domain, to create strata for the probability of finding livestock, and to select points within those strata. With maps and relatively inexpensive GPS devices, interviewing teams can navigate the selected circles and identify eligible respondents within these clusters. The identified respondents can then be interviewed regarding their household's socioeconomic conditions and livestock holdings, creating the linkages necessary to understand the socioeconomic situation of these populations. In addition, using standard statistical methods, it is possible, although challenging, to calculate weights that take into account the varying probabilities of selection and that sufficiently address overlap probabilities. Moreover, information generated as part of the GPS field implementation can be used to account for underrepresentation, as discussed below. Finally, the methodology did what it was designed to do: capture households without permanent dwellings that would have been excluded from a traditional dwelling-based sample design. The identification and interviewing of these households proved to be a major benefit of the RGCS compared to the traditional household-based approach to survey sampling.
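To give a sense of why selection probabilities vary across strata, the sketch below computes a simplified inclusion probability for a household in an RGCS design. It is not the weighting procedure used in the study (see Himelein et al. 2014 for that). It assumes circle centers are drawn independently and uniformly within a stratum, and it ignores edge effects at stratum boundaries and incomplete canvassing. The function names and the numbers in the usage note are hypothetical.

```python
import math


def inclusion_probability(n_points, radius_km, stratum_area_km2):
    """Probability that a fixed location is covered by at least one circle.

    Simplifying assumptions: centers are independent and uniform over the
    stratum, and every circle lies entirely inside the stratum.
    """
    p_single = math.pi * radius_km ** 2 / stratum_area_km2
    if not 0 < p_single <= 1:
        raise ValueError("circle area must be positive and no larger than the stratum")
    # Complement of "missed by all n circles" accounts for overlapping circles.
    return 1 - (1 - p_single) ** n_points


def design_weight(n_points, radius_km, stratum_area_km2):
    """Inverse of the inclusion probability."""
    return 1 / inclusion_probability(n_points, radius_km, stratum_area_km2)
```

With, say, 20 circles of radius 1 km in a stratum of 5,000 square kilometers, each household would represent roughly 80 households under these assumptions; larger radii or more points reduce the weight.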

### **3.3 Implementation Challenges**

Because the study area encompasses some of the harshest terrains in the region, and the methodology was novel for both the research and implementation teams, several unexpected difficulties were encountered. First, seasonal rains started earlier than expected, which created access problems such as the flooding of roads and land bordering the rivers. The access issues necessitated longer walks for interviewers, including one incident where a team had to walk 15 kilometers to reach the selected site. Second, other physical obstacles such as national park boundaries, active volcanoes, and militarized areas further restricted access to some locations. Third, ongoing strained relations between local communities and the national government led to a few isolated security incidents, including minor assaults against drivers and fieldworkers, and the (brief) kidnapping of the survey coordinator.

Beyond the implementation challenges, two other substantial issues arose as part of the data analysis process. The first was related to the calculation of the weights, which was much more complicated than originally anticipated.<sup>15</sup> The second challenge related to interviewers not canvassing the entire circle, and therefore missing potentially eligible respondents. The viewshed analysis in Fig. 5 shows the path covered by the interviewers (the white lines), the portions of the circle they could have observed during their work (green and brown terrain map), and the areas the interviewers could not have observed based on their path of travel (the black squares). Several explanations for interviewers' failure to cover the entirety of the assigned circles are possible. The weather was extremely hot during this period. Flooding made access more difficult by requiring interviewers to take long detours on foot or ford swollen rivers. The survey took place during Ramadan, which limited the availability of local guides to assist the teams. Alternatively, however, it is feasible that the areas not observed were missed because they could not possibly contain any livestock, for example, because of the presence of flood water or vegetation too thick to traverse. Thus, the areas might be missing at random or not at random, and these two possibilities require different treatment in the analysis. Because it was impossible to distinguish the cause for the missed areas, two sets of statistics were reported for this study. This issue should be investigated closely for future implementations using this method.

<sup>15</sup>A full discussion of the correct procedure to derive probability weights is included in Himelein et al. (2014).

**Fig. 5** Viewshed analysis (Color figure online)

# **4 Data Challenge and Innovation #3: Rapid Listing of Enumeration Areas<sup>16</sup>**

The main challenge encountered in the Mogadishu High Frequency Survey (MHFS) Pilot<sup>17</sup> was related to security issues, which made traditional listings of households within PSUs impossible. The MHFS was conducted between October and December 2014 by the World Bank and Altai Consulting. In this case, the PSUs were selected from existing census enumeration area maps using probability proportional to estimated size according to the United Nations Population Fund's Population Estimation Survey. In the second stage of the survey, however, carrying out a full listing was deemed unsafe. Listing households in a PSU would require the team to spend an entire day in one neighborhood, moving in a predictable pattern to reach all dwellings. The team's prolonged presence on the ground would increase their exposure to robbery, kidnapping, and assault, and increase the likelihood that local militias would object to their presence. A random walk procedure was initially proposed as a replacement, but this method has been shown in the literature to have a high likelihood of generating biased results, even if implemented under perfect conditions.<sup>18</sup>

<sup>16</sup>See Himelein et al. (2017) for a more complete discussion of the context and analytical approaches, as well as for the complete set of results.

<sup>17</sup>The MHFS is a different survey from the Somalia High Frequency Survey discussed in Sect. 1.1.
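Selection with probability proportional to estimated size (PPS), as used for the first-stage PSUs, can be sketched as follows. The text does not say which PPS algorithm was applied; systematic selection along the cumulated size measures is assumed here because it is a common choice, and the function name and inputs are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: estimated population of each enumeration area (hypothetical input).
    Returns the indices of the selected units. A unit larger than the sampling
    interval can be hit more than once.
    """
    total = sum(sizes)
    step = total / n                      # sampling interval
    start = rng.random() * step           # random start in [0, step)
    cumulative = list(accumulate(sizes))
    selected = []
    idx = 0
    for k in range(n):
        target = start + k * step
        # Advance to the unit whose cumulated size interval contains the target.
        while idx < len(cumulative) - 1 and cumulative[idx] <= target:
            idx += 1
        selected.append(idx)
    return selected
```

Under this scheme, a unit with size `s` is expected to be selected `n * s / total` times, which is its selection probability whenever that value does not exceed one.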

The team considered four alternatives to a random walk. The first was to use satellite mapping to count rooftops. This methodology is shown in the right panel of Fig. 1 and discussed as the one-stage method used in the DRC survey above. The second alternative was segmentation, also shown in the left panel of Fig. 2: the creation of clusters with discernible boundaries on the ground. The third, grids, is discussed above. The fourth alternative was a novel proposal based on a random point selection methodology, but one that considers differing probabilities of selection generated by the spatial distribution of dwellings within a PSU.

Because Mogadishu at the time was deemed too dangerous to conduct pilots of the different methodologies, a comparison between the methods was made using a simulation study. The study simulated repeated sampling via the five methods described above in three purposefully chosen PSUs which varied in size, population, and socioeconomic status. Figure 6 illustrates the size and location of the selected PSUs.

To simulate the sensitivity of each method to different degrees of clustering, three methods were used to assign consumption values to the dwellings. In the first approach, values were randomly drawn from the distribution and assigned to dwellings in each PSU, resulting in no clustering in the consumption values. In the second and third approaches, the same values were reassigned within each PSU to create a moderate and a high degree of spatial clustering. After assignment, each dwelling in each of the three PSUs had three assigned consumption values.
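The text does not describe how the values were reassigned to induce clustering, so the following is only one possible illustration. It keeps the same set of values and varies how closely their rank follows a crude spatial ordering of the dwellings; the `clustering` parameter, the ordering by coordinates, and the function name are all assumptions.

```python
import random


def assign_consumption(dwellings, values, clustering, rng):
    """Reassign the same consumption values to dwellings with tunable clustering.

    dwellings: list of (x, y) coordinates; values: one value per dwelling.
    clustering=0 gives a random assignment; clustering=1 makes values increase
    monotonically along the spatial ordering (a hypothetical clustering device).
    """
    n = len(values)
    # Crude one-dimensional spatial ordering: by x, then by y.
    spatial_order = sorted(range(n), key=lambda i: dwellings[i])
    ranked = sorted(values)
    # Blend each value's rank with random noise, then re-sort on the blend.
    keys = [clustering * (r / n) + (1 - clustering) * rng.random() for r in range(n)]
    permuted = [v for _, v in sorted(zip(keys, ranked))]
    assigned = [None] * n
    for position, dwelling_index in enumerate(spatial_order):
        assigned[dwelling_index] = permuted[position]
    return assigned
```

Calling the function three times on the same `values` with, for example, `clustering` set to 0, 0.5, and 1 would give each dwelling three consumption values, mirroring the setup described above.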

<sup>18</sup>Bauer (2014, 2016).

**Fig. 6** Size and location of selected PSUs

### **4.1 Innovations**

Several surveys have used random point selection methodologies to select households. In these methods, a random starting point is selected, and the interviewer is instructed either to interview the nearest dwelling or to proceed in a set direction until a dwelling is reached. The main drawback of these approaches is that the weights are difficult to calculate. Many researchers assume that the resulting sample is equal probability,<sup>19</sup> but that is not the case. A dwelling in a large open space has a higher probability of selection than one located in a densely populated area: more points lead to the selection of the isolated dwelling.

The innovation proposed as part of the Mogadishu survey was to calculate the size of the "shadow" of the dwelling and use this information to estimate the probability of selection. Interviewers were instructed to travel to each preselected point within the PSU, walk in the direction of the Qibla (the direction of Mecca), and interview the first dwelling they reached. They repeated this approach until a sample of 10 dwellings had been achieved. The Qibla was used in Mogadishu because many interviewers have an app on their cell phones that indicates this direction, but any verifiable direction (north, south, etc.) would work similarly well. The probability of selection of each dwelling is proportional to the size of its "shadow": the set of all possible points that would lead to the selection of that dwelling. Figure 7 provides a visual representation of a dwelling's shadow in the Qibla method. Other random point selection methods lead to differently shaped shadows, but the principle is the same.

<sup>19</sup>For further discussion, see Grais et al. (2007) and Kondo et al. (2014).

A major potential drawback of the Qibla and other related methods is the difficulty of measuring the area of the shadows. If high-quality, up-to-date satellite maps exist, then it is possible to use these images to calculate the shadow of a dwelling. The size, however, would be distorted if new structures had been built or demolished since the image was taken. Calculating the area of the shadow in the field could possibly be done by asking the interviewer to walk the perimeter of the shadow with the GPS, but this would require substantial training, and may lead to measurement error. It would also increase the time spent in the field, which was not an option in an insecure context like Mogadishu. Two alternatives were therefore used to develop a proxy for the size of the shadow: the distance to the next structure in the opposite direction to the Qibla multiplied by the actual width of the dwelling, and the measured distance to the next structure multiplied by a categorical shadow width variable (small/medium/large) as defined by the interviewer. In addition, the simulation tested an approach which ignored the probabilities of selection and assumed the Qibla method led to an equal probability sample of dwellings.

**Fig. 7** Example of the Qibla method
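Because a dwelling's selection probability is proportional to its shadow area, a mean can be estimated by weighting each sampled dwelling by the inverse of that area. The sketch below uses a ratio (Hájek-type) weighted mean, which is an assumed estimator rather than the one documented for the study, and it implements the first proxy described above. Function and parameter names are hypothetical.

```python
def proxy_shadow_area(distance_to_next_structure_m, dwelling_width_m):
    """First proxy from the text: distance to the next structure in the
    direction opposite the Qibla, multiplied by the dwelling's width."""
    return distance_to_next_structure_m * dwelling_width_m


def shadow_weighted_mean(values, shadow_areas):
    """Weighted mean with weights inversely proportional to shadow area.

    Only relative areas matter, because the proportionality constant cancels
    in the ratio.
    """
    weights = [1.0 / area for area in shadow_areas]
    return sum(w * y for w, y in zip(weights, values)) / sum(weights)


def unweighted_mean(values):
    """Equal-probability assumption, shown for comparison only."""
    return sum(values) / len(values)
```

As a small illustration, take two sampled dwellings with consumption 10 and 20 and shadow areas of 1 and 3 units. The weighted mean is 12.5, while the unweighted mean is 15: the dwelling with the larger shadow was more likely to be selected and is therefore down-weighted.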

### **4.2 Key Results**

Figure 8 presents the results from the simulations of the five sampling methods. In this figure, the three PSUs are combined, but the sampling methods are shown separately. The points are the means of the sampling distributions and the bars indicate the 5th and 95th percentiles. For each sampling method, there are three results shown: one for random assignment of consumption to dwellings (that is, no clustering); one for some clustering; and one for high clustering. The horizontal line at 40 is the true population mean consumption level. The results allow us to compare the methods' performance in terms of bias and variance and robustness to clustering.

**Fig. 8** Mean and confidence intervals (by method)

The satellite method is unbiased: the mean over all the samples is the same as the true mean. This method is also unaffected by clustering in the consumption variable. These results were expected, because this method was assumed to be equivalent to the gold standard method of a full in-field listing; that is, the images were assumed to be up-to-date. Segmentation also showed consistently unbiased results, but higher variances for higher degrees of clustering in the underlying distribution, which is consistent with sampling theory on clustering.<sup>20</sup> The grid method, despite being conceptually similar to segmentation, overestimated the means with a bias up to 10% for the clustered distributions. The bias is related to grid squares that did not have enough dwellings to meet the sample size.<sup>21</sup> As expected, the Qibla method with the correct weights yielded unbiased results, but with wide confidence intervals, although these were partially driven by a few outliers. The values of the 5th and 95th percentiles of the distribution for this method are similar to those in the segmentation method when clustering is applied. The two methods of estimating the measure of size for the Qibla method showed a small amount of bias, ranging between 1.5% and 6.5%, depending on the degree of clustering. The final Qibla alternative, the unweighted version, consistently underestimated the true mean. The random walk approach, as noted above, is not theoretically unbiased, and this is reflected in the simulation results.
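The quantities compared across methods (the mean of the sampling distribution, its 5th and 95th percentiles, and the bias relative to the true mean) can be computed from a set of simulated estimates as in the following sketch. The function name and the dictionary layout are assumptions, and the percentile convention is the default of Python's `statistics.quantiles`, which may differ slightly from the one used for the published figure.

```python
import statistics


def summarize_estimates(estimates, true_mean):
    """Summarize the sampling distribution of an estimator.

    estimates: one estimated mean per simulated sample (hypothetical input).
    Returns the mean estimate, the 5th and 95th percentiles, and the
    relative bias with respect to the known population mean.
    """
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%.
    cuts = statistics.quantiles(estimates, n=20)
    mean_estimate = statistics.fmean(estimates)
    return {
        "mean": mean_estimate,
        "p5": cuts[0],
        "p95": cuts[-1],
        "relative_bias": (mean_estimate - true_mean) / true_mean,
    }
```

With a true mean of 40, a method whose estimates average 44 would show a relative bias of 0.10, the order of magnitude reported above for the grid method under clustering.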

The main lesson learned from the simulation experiments is that full listing and satellite mapping generate the most consistently precise and unbiased results. It is also possible to generate unbiased results using a random point selection method (in this case, the Qibla method), but this approach requires the accurate calculation of the area of the shadow to generate correct probability weights. If such complete data were available to researchers, satellite mapping would likely be a better choice. The Qibla method and segmentation are both unbiased and offer roughly similar precision, and therefore any choice between them would be based mainly on ease of implementation and the amount of information available. The other methods considered, including the proxy weights for the Qibla method and gridding, introduce some bias, but may be acceptable if other alternatives are not feasible. The two unweighted methods, the unweighted Qibla method and the random walk method, demonstrate the most bias, and should therefore be avoided.

<sup>20</sup>See Eckman and West (2016).

<sup>21</sup>For a full discussion, see Himelein et al. (2017).

# **5 Implementation Challenges**

Because this study involved simulations and no fieldwork, fewer challenges were encountered. As discussed above, the main implementation challenges encountered for the Qibla method were related to the calculation of the shadow area, and by extension, the sample weights. In addition, some issues were encountered when the pre-selected point did not lead to the selection of a dwelling within the boundaries of the PSU. This issue was more pronounced in PSUs that had more open space, particularly on the perimeter of the city. In an actual survey environment, fieldwork protocols and training would be necessary to ensure consistency in addressing these situations.

# **6 Lessons Learned and Next Steps**

It is clear from the accelerating pace of the application of GIS-based technology to sample design that the field will continue to expand in the coming years, driven by less expensive and higher resolution imagery and the development of better algorithms. Despite the excitement that these advances generate, however, researchers and practitioners must not lose sight of the importance of calculating accurate probabilities of selection to generate unbiased estimates. As shown by the RGCS design and the Qibla method, these calculations can be challenging, and there is a need for two complementary research areas in GIS-based sampling. The first area where research is needed is in improved population estimates where there is no census (or equivalent) frame. The work described above in the DRC and rural Somalia was a step in this direction. Flowminder, Facebook, WorldPop, and other groups have released population estimates.<sup>22</sup> The second research area is in relation to new methods of household selection when listing is not possible. The experiments in Afar and Mogadishu offer two alternatives. Unfortunately, both led to potentially complex weight calculations and overly variable weights, which introduce variance into estimates. New technologies, such as unmanned aerial vehicles, also have the potential to reduce the time and costs involved in listing operations.<sup>23</sup>

Cost is also an important consideration when deciding between traditional and innovative methods. Any non-traditional method will incur additional costs associated with preparation and training, but these will decrease over time as familiarity grows. For example, the DRC two-stage mapping exercise required the purchase of imagery costing \$10,000, as well as three weeks of work from an experienced GIS specialist (who was new to this specific image processing application; a specialist with experience in the mapping application could have done the processing in less time). Imagery could also be obtained less expensively, by, for example, using lower resolution images or free OpenStreetMap data, where available.<sup>24</sup> The costs of the new techniques must be weighed against the costs of listing, which increases data collection costs by approximately 25% in each cluster. However, the cost of using either type of methodology is lower than employing a non-probability design, which does not guarantee reliable or representative estimates, regardless of the cost of data collection.

**Acknowledgements** The authors gratefully acknowledge the comments and contributions of Maximilian Hirn, Siobhan Murray, Utz Pape, and Aude-Sophie Rodella of the World Bank, and Sarchil Qader of Flowminder.

<sup>22</sup>For a further discussion, see Facebook Code (2017) and LandScan (2017).

<sup>23</sup>See Eckman et al. (2018).

<sup>24</sup>For further detail and availability, see openstreetmap.org.

## **References**



Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.


# **8**

# **Sampling for Representative Surveys of Displaced Populations**

**Ana Aguilera, Nandini Krishnan, Juan Muñoz, Flavio Russo Riva, Dhiraj Sharma and Tara Vishwanath**

# **1 Introduction**

As of April 2018, the United Nations High Commissioner for Refugees (UNHCR) reported that an estimated 6.6 million Syrians were internally displaced within the country, and that over 5.6 million Syrians had fled to seek refuge in other countries, of whom around 8% were accommodated in camps.<sup>1</sup> In addition to these official figures, there were anywhere from 0.4 to 1.1 million unregistered Syrian refugees in Lebanon and Jordan, and an estimated one million Syrian asylum-seekers in Europe.<sup>2</sup> In effect, more than half of Syria's pre-war population has been forcibly displaced since the beginning of the Syrian civil war.

A. Aguilera (\*) · N. Krishnan · D. Sharma · T. Vishwanath World Bank, Washington, DC, USA

e-mail: aaguileradellano@worldbank.org

N. Krishnan e-mail: nkrishnan@worldbank.org

T. Vishwanath e-mail: tvishwanath@worldbank.org

J. Muñoz Sistemas Integrales, Santiago, Chile e-mail: juan.munoz@ariel.cl

F. R. Riva São Paulo School of Administration, São Paulo, Brazil

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_8

The Syrian crisis has caused one of the largest episodes of forced displacement since World War II and some of the densest refugee-hosting situations in modern history. Syria's immediate neighbors host the bulk of Syrian refugees: Turkey, Lebanon, and Jordan rank in the top five countries globally for the number of refugees hosted. According to UNHCR data, as of June 2018, Turkey hosted 3.5 million Syrian refugees, Lebanon 0.97 million, and Jordan 0.66 million. In fact, Lebanon and Jordan hold the top two slots for per-capita recipients of refugees in the world, at 164 and 71 refugees per 1000 inhabitants, respectively (UNHCR 2019).<sup>3</sup> The influx into these countries has also occurred at a more rapid rate than prior refugee crises. At one point in the conflict, an average of 6000 Syrians were fleeing into neighboring countries every day.<sup>4</sup> Beyond the immediate impact of the inflow of refugees, the host countries are also dealing with other consequences of the Syrian conflict, including the disruption of trade and economic activity and the growth and spread of the Islamic State (also called ISIS) in Iraq. While the Kurdistan Region of Iraq (KRI) hosts at least 200,000 Syrian refugees, the ISIS-induced displacement from neighboring parts of Iraq means that KRI is now hosting over 2.25 million displaced persons, equivalent to approximately 40–50% of its population.

<sup>1</sup>http://www.unhcr.org/en-us/syria-emergency.html.

<sup>2</sup>According to a 2014 background paper on Unregistered Syrian Refugees in Lebanon, from the Lebanon Humanitarian INGO Forum, "general estimates and media reports citing unnamed Lebanese officials put the number of Syrians living in Lebanon and not registered with UNHCR between 200,000 and 400,000, although the reliability of and sources for these estimates, which do not distinguish between those in need of protection and/or assistance and those not in need, are unknown" (Lebanon Humanitarian INGO Forum 2014). The paper cites a range of estimates (from around 10 to 50%) based on data from various sources, with differing coverage and survey periods. The 2015 Jordanian census estimated 500,000–600,000 more Syrians than the numbers registered with UNHCR.

<sup>3</sup>Since these figures are based on official UNHCR registration numbers, they do not reflect the unknown number of unregistered refugees, as already noted in footnote 2. At the end of 2014, the United Nations estimated that registered Syrian refugees represented 29% of the total population in Lebanon and 9.5% of the total population in Jordan. Areas with the largest number of Syrians, such as the Bekaa Valley in Lebanon, have seen much higher proportions of refugees to local citizens.

<sup>4</sup>Quoted by the UN High Commissioner for Refugees in a speech to the United Nations Security Council in 2013.

While each neighboring country has received many Syrian refugees in both absolute and relative terms, that is where the commonality ends. Each country has responded to the influx in its own way, influenced by its previous experience of handling protracted displacement situations. Given its history of encampment of the displaced Palestinian population, Lebanon has refrained from setting up camps for Syrians. There is also understandable wariness and anxiety about the impact the influx may have on the delicate domestic political power-sharing equilibrium. In KRI, the influx of Syrian refugees overlaps with a significant number of Iraqi citizens seeking a safe haven from the ISIS militants. The refugees and internally displaced people (IDPs) are located both inside and outside camps, with very porous camp boundaries that allow residents to move freely and work outside the camp. At the time of the survey, Jordan had an explicit policy to house refugees in camps, and few refugees had legal residency and/or work permits, although a significant majority of refugees had moved outside the camps.

Creating an evidence base to frame the policies for refugees in host environments requires a sampling methodology to select a sample that represents both the host and refugee populations. There are several challenges associated with conducting a representative survey of the host community population and the forcibly displaced. In all three settings we consider, a reliable and updated sampling frame for the resident population was not available.<sup>5</sup> No sample frames existed for forcibly displaced populations as they were excluded from available national sampling frames. Databases maintained by humanitarian agencies for internal programming purposes are often incomplete and out of date. The displaced also have a high degree of mobility and they are often unwilling to speak to surveyors. In this context, and in similar contexts of forced displacement, the selection of a representative sample of hosts and the displaced becomes a major challenge to drawing credible inferences about their socio-economic outcomes.

<sup>5</sup>The last official population census in Lebanon was in 1932 and the available sampling frames were also considerably dated in Jordan and KRI.

In this chapter, we describe the strategies that had to be devised to overcome these challenges when designing the sampling procedure for the Syrian Refugee and Host Community Surveys (SRHCS), which were implemented over 2015–2016 in Lebanon, Jordan, and the Kurdistan Region of Iraq.<sup>6</sup> Section 2 describes the innovative use of available information to come up with a strategy for generating representative samples of host community and refugee households in the three settings. Section 3 presents the implementation of this strategy. Section 4 concludes by highlighting implementation challenges and drawing general lessons from our experience on sampling forcibly displaced populations.

# **2 The Innovation**

In all three settings, the main challenge to implementing a survey that would yield estimates representative of the refugee and host community populations was the lack of an updated or comprehensive sample frame, both for hosting populations and especially for displaced populations. In general, the latter were completely missing from existing national sample frames. None of the three countries had, at the time, a recent population and housing census, duly updated for population growth and movement, which could have provided the frame to choose the survey sample for the hosting community.

Each of the three contexts presented different challenges. Neither Lebanon nor Iraq had conducted a census for several decades, and existing sample frames were out of date at the time of the SRHCS. In Lebanon, information from this sample frame was not available at low levels of geographic disaggregation, while in Iraq, internal displacement of millions of Iraqis had made existing frames obsolete. In Jordan, while census exercises are undertaken every decade, data from the most recent census were not available for the SRHCS, and we had to rely on a relatively outdated sample frame based on the 2005 census. Differences in the distribution of Syrian refugees across the three contexts implied a country-specific approach as well. In Lebanon, there were no refugee camps for Syrians; in Jordan, there were two main refugee camps for Syrians; and in Kurdistan, Iraq, Syrians as well as Iraqi IDPs lived in camps but were also free to move in and out.

<sup>6</sup>The survey was conducted to support analysis on impacts of the influx on local communities in the three settings (see World Bank 2018b).

Defining a sampling strategy to yield representative samples of hosts and displaced populations in this context involved two key innovations. The first was the creation of a sample frame feasible for household listing operations from large geographical divisions where it did not exist. This was the case in Lebanon and in the two largest refugee camps in Jordan. In Lebanon, cartographic divisions of the country were only available for large areas, and had to be segmented and subsegmented based on satellite imagery and dwelling counts to yield geographic areas small enough for listing. These segmentations attempted to divide the larger areas into subdivisions or segments of equal population size, much the same way as enumeration areas are generated. Similarly, for the two largest refugee camps in Jordan, Zaatari and Azraq, satellite imagery was used to divide the camps into mutually exclusive and exhaustive sampling units of roughly equal population size.

The second innovation was the use of available information from different sources on displaced population prevalence, which was incorporated into the sample frames of host population prevalence. In most cases, this information was only available at a geographic level higher than the smaller sampling units used in the final frame. These data allowed for the estimation of known probabilities of selection. The first stage sample selection assumed these probabilities were uniformly distributed over the larger geographic area, and in the sampling units within that area. The household listing operation in the selected small sampling units was then used to update this known (albeit incorrect) probability of selection. In Lebanon and Kurdistan, auxiliary information on the spatial distribution of refugees and IDPs, available from the UNHCR and the International Organization for Migration (IOM), was merged with the sampling frame. Subdistrict level refugee and IDP prevalence information was used to stratify subdistricts by intensity of prevalence: low, middle, and high. The sample was further stratified into subgroups of interest, depending on the context. In Lebanon, the survey was representative of the host community and the Syrian refugee population. In Kurdistan, the scope of the survey was expanded to include IDPs, so that the survey was representative of the host community, Syrian refugees inside and outside of camps, and IDPs inside and outside of camps.
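Stratifying subdistricts by intensity of prevalence can be sketched as a simple ranking exercise. The text does not report the cutoffs that separate low, middle, and high prevalence, so the split into thirds below is an assumption, as are the function name and the input format.

```python
def stratify_by_prevalence(prevalence):
    """Label each subdistrict 'low', 'middle', or 'high' by displaced prevalence.

    prevalence: dict mapping subdistrict name to the share of refugees/IDPs in
    its population (hypothetical input). Cutoffs are terciles of the ranking,
    an illustrative assumption rather than the survey's actual thresholds.
    """
    ranked = sorted(prevalence, key=prevalence.get)
    n = len(ranked)
    labels = {}
    for rank, name in enumerate(ranked):
        if rank < n / 3:
            labels[name] = "low"
        elif rank < 2 * n / 3:
            labels[name] = "middle"
        else:
            labels[name] = "high"
    return labels
```

Sampling units would then inherit the label of the subdistrict that contains them, consistent with the assumption that prevalence is uniform within the larger area until the listing operation provides better information.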

# **3 Implementation**

In what follows, we detail the sampling strategy for Lebanon, which was the most complicated, and then describe the strategy for the other two contexts.

**Lebanon.** Conducting a representative survey in Lebanon was especially challenging. The first difficulty was that, as of 2015, there was no recent or reliable sample frame, even for Lebanese households, as the last official population census was conducted in 1932. Typically, such a sample frame consists of the universe of enumeration areas in a country, with associated estimates of population. This meant that we had to construct our own sample frame by selecting a few Small Area Units (SAUs) and then conducting a full listing operation by visiting every household within the selected SAUs and collecting basic demographic and contact information. The second difficulty was that there was no available cartographic division of the country into geographic areas small enough to be the subject of a full listing operation, which could then serve as a sampling frame for the SAUs. Circonscriptions Foncières (CFs) were the finest level of disaggregation available; CFs are generally too large to be listed, as some have populations of over 100,000. Finally, there was no available sampling frame for Syrian refugees in Lebanon, which meant that we had to depend on UNHCR data on registered Syrian refugees, combined with the estimates of Lebanese population at the CF level. Given these challenges and time and budgetary constraints, the sample was selected in multiple (four) stages as described below.

### **3.1 First Sampling Stage**

The sample frame for the first stage is the list of 1301 CFs published by the Council for Development and Reconstruction (CDR) in 2004 and the 2014 UNHCR registration database. Each CF is identified by way of its administrative affiliation: Kaza, Qadha, and Mohafza. The UNHCR database reports the total population in each CF, as well as the number of Lebanese and Syrian population in each.<sup>7,8,9</sup> The CF cartographic boundaries are described digitally in a linked Geographic Information System shapefile.

The CFs were sorted into three strata depending on their ex-ante prevalence of Syrian population, as follows:


Prevalence of Syrian refugees at the CF level was defined as the number of registered Syrian refugees from the 2014 UNHCR database divided by the sum of the number of registered Syrian refugees and the 2004 Lebanese population counts from the CDR database. The first columns of Table 1 show the distribution of the CFs into strata, as well as the population in each stratum, as per the UNHCR database.

<sup>7</sup>Lebanese population distribution by cadasters, supplied by CDR Shapefile (2002–2003); Population estimate of Lebanese 4 million referenced in the Lebanon Crisis Response Plan (LCRP) (UNHCR 2015).

<sup>8</sup>Total population of Syrian refugees as reported by the UNHCR registration database as of December 2014.

<sup>9</sup>Total population of Palestinian refugees in Lebanon (PRL) estimated between 260,000 and 280,000 (UNRWA-AUB 2010). Database provided the population distribution by camps and gatherings. In addition, the total population of Palestinian refugees from Syria is estimated to be 43,000 according to the UNRWA; UNHABITAT UNDP study on gatherings.

Our intention was to select 75 CFs in total. The decision of how to distribute them across the 3 strata faced the classical dilemma of whether to do it in proportion to the population of the strata, which would deliver nearly optimal estimates for the country as a whole, or to allocate the same sample size (i.e. 25 CFs) to each stratum, which would deliver estimates of nearly the same quality for each of them. Since both considerations were important for the 2015 SRHCS, we opted to do it in accordance with Markwardt's rule (also known as the '50/50 equal/proportional allocation'), which is generally considered a good compromise between the two extremes. The last three columns in Table 1 show the chosen allocation, the corresponding sample sizes (in number of households), and the expected maximum margins of error.<sup>10</sup>
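The compromise between equal and proportional allocation can be sketched in a few lines. The sketch below assumes that the '50/50' rule averages the equal allocation and the proportional allocation for each stratum; the stratum populations, the function name `compromise_allocation`, and the largest-remainder rounding are illustrative assumptions, not values or procedures taken from Table 1.

```python
def compromise_allocation(stratum_pops, total_units, equal_share=0.5):
    """Blend equal and proportional allocation across strata (illustrative)."""
    n_strata = len(stratum_pops)
    total_pop = sum(stratum_pops.values())
    raw = {}
    for name, pop in stratum_pops.items():
        equal_part = total_units / n_strata
        proportional_part = total_units * pop / total_pop
        raw[name] = equal_share * equal_part + (1 - equal_share) * proportional_part

    # Largest-remainder rounding so the integer allocations sum to total_units.
    alloc = {name: int(value) for name, value in raw.items()}
    leftover = total_units - sum(alloc.values())
    by_remainder = sorted(raw, key=lambda name: raw[name] - alloc[name], reverse=True)
    for name in by_remainder[:leftover]:
        alloc[name] += 1
    return alloc


# Hypothetical stratum populations (NOT the figures from Table 1).
example_pops = {"low": 3_000_000, "middle": 1_500_000, "high": 500_000}
allocation = compromise_allocation(example_pops, total_units=75)
print(allocation)
```

With these made-up populations, the smallest stratum receives more units than under proportional allocation (which would give it about 8) and fewer than under equal allocation (25), which is the intended compromise.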

Within each stratum, CFs were selected for inclusion with probability proportional to size (PPS), using the total population as a measure of size, and with implicit stratification by administrative units (Kaza, Qadha and Mohafza). Some of the large CFs were selected more than once. For instance, there were 34 selections made from among the 'low prevalence' CFs (as per Table 1), and one extremely populous CF (Chiyah, located in Mount Lebanon) was randomly selected three times. As a result, the 75 selections were drawn from 71 different CFs. Annex Table 1 shows the list of sampled CFs, where the last column indicates the number of times each CF was selected in the sample (e.g. one, two or three times depending on each case).
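One common way to implement PPS selection with implicit stratification is systematic sampling along the cumulated measure of size, after sorting the frame by administrative unit. The chapter does not state which PPS algorithm was used, so the sketch below is an assumption; the frame, the identifiers, and the function name `pps_systematic` are hypothetical. It also illustrates why a very large unit can be selected several times.

```python
import random


def pps_systematic(units, n_select, rng):
    """Systematic PPS selection over an ordered list of (unit_id, size) pairs.

    Returns {unit_id: number_of_hits}. A unit larger than the sampling
    interval can be hit more than once.
    """
    total = sum(size for _, size in units)
    interval = total / n_select
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n_select)]

    hits = {}
    cumulative = 0
    t = 0
    for unit_id, size in units:
        cumulative += size                   # unit covers [previous, cumulative)
        while t < n_select and targets[t] < cumulative:
            hits[unit_id] = hits.get(unit_id, 0) + 1
            t += 1
    # Guard against floating-point rounding on the very last target.
    if t < n_select:
        last_id = units[-1][0]
        hits[last_id] = hits.get(last_id, 0) + (n_select - t)
    return hits


# Hypothetical frame: (CF id, administrative unit, total population).
frame = [
    ("CF-A", "Mohafza 1", 120_000),
    ("CF-B", "Mohafza 2", 8_000),
    ("CF-C", "Mohafza 1", 12_000),
    ("CF-D", "Mohafza 2", 20_000),
    ("CF-E", "Mohafza 1", 40_000),
]
# Implicit stratification: sort by administrative unit before selecting.
frame.sort(key=lambda row: row[1])
ordered_units = [(cf_id, population) for cf_id, _, population in frame]

pps_hits = pps_systematic(ordered_units, n_select=5, rng=random.Random(2015))
print(pps_hits)
```

In this toy frame the sampling interval is 40,000, so the unit with 120,000 people spans exactly three intervals and is always hit three times, mirroring the situation described for the most populous CF.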

<sup>10</sup>More precisely, the last column of Table 1 shows the maximum expected margins of error for the estimation of a household-level prevalence *P* (such as the percentage of households with children, the percentage of households reporting illnesses, etc.) at the 95% confidence level. These are given by $\mathrm{ME} = 1.96\,[\mathrm{Def}\,P(1-P)/n]^{0.5}$, where *n* is the sample size and Def is the *design effect*, basically due to the tendency of neighboring households to behave similarly with regard to the indicator being observed. The column was computed for Def=2 (a value found in practice for many indicators of interest) and *P*=0.5 (for which ME is maximum).
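As a quick numerical check of the margin-of-error expression in footnote 10, the following sketch evaluates it for a hypothetical sample size of 1,000 households (the actual stratum sample sizes are those in Table 1 and are not reproduced here). The function name `max_margin_of_error` is illustrative.

```python
import math


def max_margin_of_error(n, deff=2.0, p=0.5, z=1.96):
    """Margin of error ME = z * sqrt(Def * P * (1 - P) / n)."""
    return z * math.sqrt(deff * p * (1 - p) / n)


# Hypothetical sample size; Def = 2 and P = 0.5 as in the footnote.
me_example = max_margin_of_error(n=1000)
print(f"{me_example:.4f}")
```

Under these assumptions the margin of error is roughly 4.4 percentage points, and it shrinks with the square root of the sample size.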

### **3.2 Segmentation of Circonscriptions Foncières (PSUs)**

Given that CFs are larger in size than typical census Enumeration Areas, which are roughly of 200 households each, the majority of the selected sample CFs were too large to be manageable for implementing a complete household listing operation. For this reason, these large CFs were divided into 'super segments' and 'segments' of roughly equal size within each category, using total number of households as a measure of size. The number of households in each 'super segment' or 'segment' was estimated based on observation of building heights and estimated population density in each area in the 2015 ESRI World Imagery<sup>11</sup> and 2015 Google Earth imagery, combined with local knowledge of these areas.

Based on the estimated measure of size, only five CFs were considered to be too large in size and hence were selected for 'super segmentation'. At a later stage, all CFs and 'super segments' were divided into 'segments' due to their large size.

### **3.3 Second Sampling Stage: Super Segmentation of Circonscriptions Foncières**

In the second stage, the boundaries of the 'super segments' in each CF were drawn using the 2015 ESRI World imagery basemap. These boundaries take into account the total estimated household count, as well as natural boundaries such as major roads, rivers, and paths that can easily be recognized by field teams during the listing operation and implementation of the household questionnaire.

Within each super-segmented CF, the sample 'super segments' were selected with equal probability, based on the assumption that each 'super segment' is of roughly equal size. The number of 'super segments' selected within each CF was the same as the number of times the corresponding CF was selected in the first sampling stage. For instance, if a CF was selected three times in the first sampling stage, we selected three 'super segments' within this CF. Similarly, if a CF was selected only once or twice in the first sampling stage, we correspondingly selected one or two 'super segments' in the second sampling stage.

<sup>11</sup>Esri, DigitalGlobe, GeoEye, Earthstar Geographics, CNES/Airbus DS, USDA, USGS, AEX, Getmapping, Aerogrid, IGN, IGP, swisstopo, and the GIS User Community.

Annex Table 2 shows the list of 'super segments' within selected CFs, where the ninth column indicates the number of times each CF was selected in the sample (e.g. one, two or three times depending on each case). The column headed 'Prob 2' shows the probability of selecting the 'super segment' within each CF.

### **3.4 Third Sampling Stage: Segmentation of Circonscriptions Foncières**

In a third stage, the boundaries of the 'segments' were drawn for all CFs and selected 'super segments' within CFs. Similar to the process of 'super segmentation', boundaries of segments were drawn using the 2015 ESRI World imagery basemap. These boundaries also take into account the total estimated household count, as well as natural boundaries such as major roads, rivers, and paths.

Within each CF or corresponding 'super segment', the sample 'segments' were selected with equal probability, with the underlying assumption that each 'segment' is of roughly equal size. Annex Table 3 shows the list of 'segments' for all CFs, where the last column indicates the probability of selecting the 'segment' within each CF in the third sampling stage.

### **3.5 Fourth Sampling Stage**

The sample frame for the fourth stage is the full list of all households in the sample CF segments. The listing operation consisted of a full enumeration of all physical structures in the area, with each physical structure being classified as a primary or secondary residential dwelling, commercial building, school, hospital, government office, etc. The listing operation collected information about the household occupying each residential dwelling, and each household was classified as either a Syrian refugee household or a host community household. Care was also taken to record two households living in the same unit separately.<sup>12</sup>

To ensure the quality and completeness of the listing operation, enumerators relied on high-resolution paper maps identifying all buildings within each segment. Each building or structure was pre-assigned a unique identifier. Enumerators then created a record for each residential unit and household following the protocol described in the 2015 SRHCS Manual of Enumerator. The 40 households to be visited by the 2015 SRHCS in each segment (with a target of 20 Syrian refugee and 20 non-Syrian refugee households in each) were selected from the listing data by systematic equal-probability sampling.<sup>13</sup>
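Systematic equal-probability sampling from a listing can be sketched as follows. This is a generic illustration with a random start and a fractional interval, not the survey's actual selection program; the household identifiers and the function name `systematic_sample` are hypothetical. When the listing is shorter than the target, all listed households are taken.

```python
import random


def systematic_sample(listing, n_select, rng):
    """Select n_select entries from an ordered listing with equal probability."""
    if n_select >= len(listing):
        return list(listing)                 # too few households: take them all
    interval = len(listing) / n_select       # fractional sampling interval (> 1)
    start = rng.random() * interval          # random start in [0, interval)
    positions = [min(int(start + k * interval), len(listing) - 1)
                 for k in range(n_select)]
    return [listing[pos] for pos in positions]


rng = random.Random(7)
# Hypothetical listing of one segment, split by household type.
syrian_listing = [f"SYR-{i:03d}" for i in range(57)]
host_listing = [f"HOST-{i:03d}" for i in range(15)]

syrian_sample = systematic_sample(syrian_listing, 20, rng)
host_sample = systematic_sample(host_listing, 20, rng)
print(len(syrian_sample), len(host_sample))
```

Because the interval is larger than one whenever the listing exceeds the target, consecutive selections always fall on different households.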

### **3.6 Selection Probabilities and Sampling Weights**

Given the sampling design discussed in the last paragraphs, the probability $p_{\rm hizsj}$ of selecting household $hizsj$ in segment $hizs$ of super segment $hiz$ in Circonscription Foncière $hi$ of stratum $h$ is given by:

$$p_{\rm hizsj} = \frac{k_h n_{\rm hi}}{\sum_i n_{\rm hi}} \times \frac{t_{\rm hi}}{T_{\rm hi}} \times \frac{g_{\rm hi}}{G_{\rm hi}} \times \frac{m_{\rm hij}}{n_{\rm hi}^{'}}$$

where the four fractions on the right-hand side respectively represent the probability of selecting the CF in the first stage, and the conditional probabilities of selecting the super segment, the segment, and the household in the second, third, and fourth stages, and:

<sup>12</sup>One segment (in the Saida Ed-Dekermane CF, segment number 61119-0-26) was dropped from the original sample since the field team could not get access to the area due to insecurity and was thus unable to implement the household listing operation. Therefore, the intended sample of 40 households in this segment was distributed among two other similar segments, selecting 20 additional households in each. The selection of these two segments was based on the household listing data and local knowledge provided by the survey firm. The two identified segments are located in Saida Al-Qadima and Mazraa 2 (Beirut) and are similar to the Saida Ed-Dekermane segment in that they have: (i) a high share of Palestinian refugees; (ii) high density of urban population; and (iii) high poverty rate.

<sup>13</sup>After listing, only 15 households were found in segment 31116-11. Therefore, all eligible households were selected for interviewing (full census). The total sample size was reduced by 25, for a total of 2975 sample households.


To deliver unbiased estimates from the sample, the data from each household $hizsj$ should be assigned a sampling weight (or raising factor) $w_{\rm hizsj}$, equal to the inverse of its selection probability (i.e. $w_{\rm hizsj} = p_{\rm hizsj}^{-1}$).
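The four-stage probability and the resulting weight can be illustrated numerically. The list of symbol definitions did not survive in this text, so the reading of each argument below (number of selections in the stratum, measures of size, units selected over units available, households selected over households listed) is an assumption based on the description of the four fractions, and all numbers are invented.

```python
def selection_probability(k_h, size_cf, size_stratum,
                          super_selected, super_total,
                          seg_selected, seg_total,
                          hh_selected, hh_listed):
    """Product of the four stage probabilities (argument meanings are assumed)."""
    p_cf = k_h * size_cf / size_stratum          # stage 1: PPS selection of the CF
    p_super = super_selected / super_total       # stage 2: equal-probability super segment
    p_segment = seg_selected / seg_total         # stage 3: equal-probability segment
    p_household = hh_selected / hh_listed        # stage 4: household from the listing
    return p_cf * p_super * p_segment * p_household


# Invented example values, for illustration only.
p_example = selection_probability(
    k_h=25, size_cf=20_000, size_stratum=1_000_000,
    super_selected=1, super_total=4,
    seg_selected=1, seg_total=10,
    hh_selected=20, hh_listed=80,
)
weight_example = 1 / p_example
print(p_example, weight_example)
```

In this example each sampled household stands for about 320 households in the population. For a CF that was not super-segmented, the second fraction would simply equal one.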

**Kurdistan**. Much of the sampling procedure in Kurdistan resembled that of Lebanon, except for one important difference: unlike in Lebanon, the frame for the first stage sample existed in Kurdistan (albeit outdated), and a subset of the enumeration areas had updated population information from the 2012 IHSES survey (which did not take into account subsequent internal displacement). A subsample of the 2012 clusters was selected for our survey, followed by a comprehensive listing exercise to update the frame for second stage sampling. Four strata based on refugee and IDP prevalence were defined as follows:


In the first stage, within each stratum, enumeration areas were selected with PPS using the number of households reported from the 2012 listing exercise as a measure of size. In the second stage, 18 households per PSU were selected: six Syrian households, six IDP households, and six host community households in each PSU to the extent possible. In areas where there were fewer than six Syrian or IDP households, the shortfall was met by host community households. The sampling frame for second stage sampling was the complete list of households in the selected EAs from the listing exercise.
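The second-stage quota rule, including the shortfall being met by host community households, can be written down compactly. The sketch below is an illustration of the rule as described; the function name `allocate_households` and the listing counts are hypothetical.

```python
def allocate_households(n_syrian, n_idp, n_host, per_group=6, total=18):
    """Number of households to select per group in one PSU."""
    syrian = min(n_syrian, per_group)
    idp = min(n_idp, per_group)
    # Host households fill their own quota plus any Syrian or IDP shortfall.
    host = min(n_host, total - syrian - idp)
    return {"syrian": syrian, "idp": idp, "host": host}


# Hypothetical listing counts for one enumeration area.
psu_allocation = allocate_households(n_syrian=2, n_idp=10, n_host=150)
print(psu_allocation)
```

Here only two Syrian households were listed, so ten host community households are selected instead of six, keeping the PSU total at 18.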

**Jordan**. In contrast to Lebanon and Iraq, Jordan has carried out Population and Housing Censuses at regular intervals, with the last one in late 2015. What was particularly attractive about the latest census from the perspective of sampling was that it explicitly asked about the nationality of all residents. This would have allowed stratification of areas by density of Syrians. However, the original design could not be implemented because we could not access the new sample frame based on the 2015 Jordanian census. The design was then amended to include a representative sample of the Azraq and Za'atari camps (which account for the vast majority of Syrian refugees in camps in Jordan). This sample was complemented by purposive samples of the surrounding governorates, Mafraq and Zarqa, where the sample included areas physically proximate to the camp and other areas with a high number of Syrian refugees. In Amman Governorate, a purposive sample was drawn, combining a geographically distributed sample with a sample of areas with a high prevalence of Syrian refugees per the 2015 census, as indicated by the Jordanian Department of Statistics. Analytically, this implies the insights from Jordan will be limited to camp residents, neighboring areas of the camps, and Amman governorate.

# **4 Implementation Challenges, Lessons Learned, and Next Steps**

The three surveys described in this paper were designed to generate comparable findings on the lives and livelihoods of Syrian refugees and host communities in the three settings. The absence of updated national sample frames and the lack of a comprehensive mapping of the forcibly displaced within these countries posed challenges for the design of these surveys. These challenges are not unique; indeed, most developing countries face similar issues, which are exacerbated at times of large scale internal population movements or in contexts of a large localized or widespread influx of migrants. Such data challenges become particularly stark in countries hosting displaced populations or in situations of ongoing or protracted conflict as local populations move to escape violence. But exclusion of displaced persons from national sampling frames, and consequently from national surveys, provides a skewed picture of the world (World Bank 2018a). As the number of displaced persons continues to increase, it becomes all the more urgent to devise strategies to include them in representative socioeconomic surveys.

This methodology paper describes the strategies implemented in the three contexts: generating known ex-ante selection probabilities from a variety of data sources, using geospatial segmenting to create enumeration areas where they did not exist, and using data collected by humanitarian agencies to generate sample frames for displaced populations. The strategies implemented in these surveys can be useful in designing similar exercises in contexts of forced displacement. Moreover, this effort shows the importance of including refugees and non-nationals in national sample frames. The move by Jordan's statistical agency to explicitly include non-nationals in the 2017/2018 household survey is a commendable step in the right direction.

# **Annex**

See Tables 1, 2, and 3.




**Table 2** List of selected segments (enumeration areas)—Lebanon




**Table 3** (continued)



# **References**


The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **9**

# **Rapid Consumption Surveys**

**Utz Pape and Johan Mistiaen**

# **1 The Data Demand and Challenge**

Poverty is the paramount indicator used to gauge the socioeconomic well-being of a population. Particularly after a shock or in a volatile context, poverty estimates can identify who was affected, and how severely. This is particularly relevant in fragile countries, where monitoring poverty dynamics helps measure the country's progress toward stability, or increased risk of relapsing into conflict. As one of the main indicators for poverty, monetary poverty is measured by a welfare aggregate, usually based on consumption in developing countries, and a poverty line. The poverty line indicates the minimum level of welfare required for healthy living.

This chapter is a summary of Pape, Utz Johann, and Johan Mistiaen. "Household Expenditure and Poverty Measures in 60 Minutes: A New Approach with Results from Mogadishu." Policy Research Working Paper Series. The World Bank, 2018. https://ideas.repec.org/p/wbk/wbrwps/8430.html.

U. Pape (\*) · J. Mistiaen World Bank, Washington, DC, USA e-mail: upape@worldbank.org

J. Mistiaen e-mail: jmistiaen@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_9

Consumption aggregates are traditionally estimated based on time-consuming household consumption surveys. A household consumption questionnaire records consumption (how much was consumed) and expenditure (how much was purchased, or obtained in other ways like gifts or aid) for a comprehensive list of food and non-food items. Covering between 300 and 400 items, the questionnaire often exceeds 120 minutes to administer. In addition to the longer administering time leading to higher costs, response fatigue can increase measurement error, especially for items at the end of the questionnaire. In a fragile country context, a face-to-face time of 90–120 minutes can be prohibitively high. In the case of Somalia, security concerns restricted the duration of a survey visit in Mogadishu to about 60 minutes.

The extensive nature of household consumption surveys makes it difficult to obtain updated poverty estimates, especially when they are needed the most, such as after a shock and in fragile countries. Approaches have therefore been developed to reduce administering times to allow for the collection of consumption data. The most straightforward approach to minimize administering time is to reduce the number of items surveyed, either by asking for aggregates, or by skipping less frequently consumed items, which is called the reduced consumption methodology. However, both approaches (using aggregates and skipping less common items) have been shown to underestimate consumption, which in turn overestimates poverty.<sup>1</sup> Splitting the questionnaire to allow for multiple visits is another solution, but potential attrition issues, especially in fragile contexts, increase the required sample size and may be costlier. In addition, multiple visits to the same household can increase security concerns.

The second class of approaches utilizes a full consumption baseline survey and updates poverty estimates based on a small subset of collected indicators.<sup>2</sup> These approaches estimate a welfare model based on the baseline survey using a small number of easy-to-collect indicators. This allows poverty estimates to be updated by collecting only the set of indicators instead of the direct consumption data. While this approach is cost-effective and easy to implement in normal circumstances, it has two major drawbacks in the context of fragility and shocks. First, the approach requires a baseline survey, which is sometimes not available, as in the case of Mogadishu. Second, the approach relies on a structural model estimated from the baseline survey.<sup>3</sup> In the case of shocks, structural assumptions that cannot be tested are often violated. Thus, poverty updates based on the violated assumptions tend to underestimate the impact of the shock on poverty. Therefore, cross-survey imputation methodologies are not applicable in the context of shocks and fragility.

<sup>1</sup>Beegle et al. (2012).

# **2 The Innovation**

To assess poverty in Mogadishu, we tested a new methodology combining an innovative questionnaire design with standard imputation techniques. This substantially reduces the administering time of a consumption survey from multiple hours or even days to about 60 minutes, while still resulting in credible poverty estimates. The gain in shorter administering time, however, is offset by the need to impute missing consumption values. Given the design of the questionnaire, this method circumvents the systematic biases identified for alternative methodologies.

## **2.1 Overview**

The rapid consumption survey methodology involves five main steps (Fig. 1). First, core items are selected based on their importance for consumption. Second, the remaining items are partitioned into optional modules. Third, optional modules are assigned to groups of households. Fourth, after data collection, consumption of optional modules is imputed for all households. Fifth, the resulting consumption aggregate is used to estimate poverty indicators.

<sup>2</sup>Douidich et al. (2013); SWIFT.

<sup>3</sup>Christiaensen et al. (2011).

*The consumption module is partitioned into core and optional modules, which in turn are assigned to households. Consumption is imputed utilizing the sub-sample information of the optional modules either by single or multiple imputation methods.*

**Fig. 1** Illustration of the rapid consumption survey methodology (using illustrative data only)

First, core consumption items are selected. Consumption in a country bears some variability, but usually a small number of a few dozen items captures the majority of consumption. These items are assigned to the core module, which will be administered to all households. Important items can be identified by their average consumption share per household or across households. Previous consumption surveys in the same country, or consumption shares of neighboring or similar countries, can be used to estimate consumption shares.

Second, non-core items are partitioned into optional modules. Different methods can be used for this partitioning. In the simplest case, the remaining items are ordered according to their consumption share and assigned one by one while iterating the optional module in each step. A more sophisticated method takes into account the correlation between items, and partitions them in a way so that all items within a module explain consumption as best as possible, while the information between modules should be highly correlated. The partitioning influences the standard error of the estimation, but does not introduce bias. Thus, even in the absence of a previous survey, this methodology can be applied. More complicated partition patterns can result in a set of very different items in each module. However, the modular structure should not influence the layout of the questionnaire. Instead, all items should be grouped into categories of consumption (e.g. cereals) and different recall periods. It is therefore recommended to use CAPI technology, which allows the structure of the consumption module to be hidden from the enumerator.
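The simplest partitioning rule, ordering items by consumption share and dealing them out to the optional modules in turn, might look like the sketch below. The item names, shares, module counts, and the function name `partition_items` are invented for illustration; the correlation-based method mentioned above is not implemented here.

```python
def partition_items(shares, n_core, n_optional):
    """Split items into a core module and round-robin optional modules."""
    ranked = sorted(shares, key=shares.get, reverse=True)
    core = ranked[:n_core]                       # highest-share items go to the core
    optional = [[] for _ in range(n_optional)]
    for position, item in enumerate(ranked[n_core:]):
        optional[position % n_optional].append(item)   # iterate the module each step
    return core, optional


# Hypothetical consumption shares (illustrative only).
example_shares = {
    "rice": 0.30, "sugar": 0.20, "tea": 0.15, "oil": 0.12,
    "milk": 0.08, "pasta": 0.06, "salt": 0.05, "soap": 0.04,
}
core_items, optional_modules = partition_items(example_shares, n_core=2, n_optional=3)
print(core_items)
print(optional_modules)
```

Dealing items out in rank order tends to give each optional module a similar mix of more and less important items.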

Third, optional modules should be assigned to groups of households. Optional modules should be assigned randomly, stratified by clusters to ensure appropriate representation of optional modules in each cluster. This means that each cluster should include about the same number of households assigned to each optional module. This step is followed by the actual data collection.
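A minimal way to assign optional modules at random while keeping them balanced within each cluster is to shuffle the households of a cluster and then cycle through the modules. This is one possible implementation of the idea, not necessarily the one used in the survey; the cluster and household identifiers and the function name `assign_modules` are hypothetical.

```python
import random
from collections import Counter


def assign_modules(households_by_cluster, n_modules, rng):
    """Randomly assign one optional module per household, balanced by cluster."""
    assignment = {}
    for cluster, households in households_by_cluster.items():
        shuffled = list(households)
        rng.shuffle(shuffled)                    # random order within the cluster
        offset = rng.randrange(n_modules)        # random starting module
        for position, household in enumerate(shuffled):
            assignment[household] = (position + offset) % n_modules
    return assignment


# Two hypothetical clusters of nine households each, three optional modules.
clusters = {
    "cluster-1": [f"c1-hh{i}" for i in range(9)],
    "cluster-2": [f"c2-hh{i}" for i in range(9)],
}
module_of = assign_modules(clusters, n_modules=3, rng=random.Random(42))
counts_cluster_1 = Counter(module_of[hh] for hh in clusters["cluster-1"])
print(counts_cluster_1)
```

With nine households and three modules, each module is administered to exactly three households in every cluster; when the cluster size is not a multiple of the number of modules, the counts differ by at most one.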

Fourth, household consumption should be estimated by imputation. The average consumption of each optional module can be estimated based on the subsample of households assigned to the optional module. In the most straightforward case, a simple average can be estimated. More sophisticated techniques can employ a welfare model based on household characteristics and consumption of the core items. The next section presents six techniques and demonstrates their performance on the dataset from Hargeisa.
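The most straightforward imputation, replacing each unobserved optional module by the simple average among households that were assigned it, can be sketched as follows. The records are invented, the function name `impute_by_module_mean` is illustrative, and sampling weights as well as the model-based techniques are deliberately left out.

```python
def impute_by_module_mean(records, n_modules):
    """Total consumption = core + observed module + mean of each unobserved module."""
    module_means = []
    for module in range(n_modules):
        observed = [r["module_value"] for r in records if r["module"] == module]
        module_means.append(sum(observed) / len(observed))

    totals = []
    for r in records:
        imputed = sum(module_means[m] for m in range(n_modules) if m != r["module"])
        totals.append(r["core"] + r["module_value"] + imputed)
    return totals


# Hypothetical households: core consumption, assigned module, observed module value.
example_records = [
    {"core": 100, "module": 0, "module_value": 10},
    {"core": 80, "module": 0, "module_value": 20},
    {"core": 120, "module": 1, "module_value": 6},
    {"core": 90, "module": 1, "module_value": 4},
]
imputed_totals = impute_by_module_mean(example_records, n_modules=2)
print(imputed_totals)
```

Every household receives the same imputed value for a module it was not asked about, so the spread of the resulting aggregate is narrower than the spread of true consumption.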

Single imputation of the consumption aggregate underestimates the variance of household consumption. Depending on the location of the poverty line relative to the consumption distribution, this may either consistently under- or overestimate poverty. Multiple imputations based on bootstrapping can mitigate the problem but will render analysis more complicated. We use single as well as multiple imputation techniques for the evaluation of the methodology.

# **3 Key Results**

In this section, the rapid consumption methodology will first be applied to a dataset including a full consumption module from Hargeisa, Somaliland. This will be used to assess the performance of the rapid consumption methodology compared to the traditional full consumption methodology. The results of the High Frequency Survey in Mogadishu are then presented. Security risks in Mogadishu restrict face-to-face interview time to less than one hour; therefore, the rapid consumption methodology was used to derive the first ever consumption estimates for Mogadishu. We present the resulting consumption aggregate, and perform consistency checks for its validation.

## **3.1 Ex Post Simulation**

The rapid consumption methodology is applied ex post to household budget data collected in Hargeisa, Somaliland. Hargeisa was chosen as it is very similar to Mogadishu. Using the full consumption dataset from Hargeisa allows a full assessment of the new methodology. Based on selected indicators, we compare the results of the estimated consumption based on the rapid consumption methodology with the results from using the traditional full consumption module. We add a comparison with the results for a reduced consumption module.

**Table 1** Number of items and consumption share captured per module

The simulation assigns each household to one optional module. The consumption data for the modules not assigned to the household is deleted. Multiple simulations are performed, with various modules being assigned to households. Across the simulations, we calculate three consumption indicators and four poverty and inequality indicators. The consumption indicators capture the accuracy of the estimation at three different levels: the household level, the cluster level (consisting of about nine households), and the level of the dataset. In addition, we calculate the poverty headcount (FGT0), the poverty depth (FGT1), the poverty severity (FGT2), and the Gini coefficient to capture inequality.
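For reference, the poverty and inequality indicators named above have standard definitions: the Foster-Greer-Thorbecke measure of order α averages the normalized poverty gap raised to the power α over the whole population, and the Gini coefficient can be computed from the ranked consumption values. The sketch below is unweighted and uses invented consumption values and an invented poverty line; the function names are illustrative.

```python
def fgt(consumption, poverty_line, alpha):
    """Foster-Greer-Thorbecke index: alpha=0 headcount, 1 depth, 2 severity."""
    gaps = [(poverty_line - c) / poverty_line for c in consumption if c < poverty_line]
    return sum(gap ** alpha for gap in gaps) / len(consumption)


def gini(consumption):
    """Gini coefficient from ranked values (unweighted)."""
    ordered = sorted(consumption)
    n = len(ordered)
    total = sum(ordered)
    weighted = sum(rank * value for rank, value in enumerate(ordered, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical per capita consumption values and poverty line.
example_consumption = [50, 80, 120, 150]
example_line = 100
print(fgt(example_consumption, example_line, 0),
      fgt(example_consumption, example_line, 1),
      fgt(example_consumption, example_line, 2),
      gini(example_consumption))
```

A survey application would additionally apply household sampling weights and household size when aggregating, which this sketch omits.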

Six estimation techniques are compared with respect to their relative bias and relative standard error, based on 20 simulations. All simulations used the same item assignment to modules using the algorithm as described (see Table 1 for the resulting consumption shares per module).<sup>4</sup> The estimation techniques differ considerably in terms of performance. We also compare the techniques to using a reduced consumption module where the same consumption items are collected for all households. The number of items is equal to the size of the core module and one optional module, implying a comparable face-to-face interview time to the rapid consumption methodology.

Comparing the reduced consumption approach with the full consumption as a reference, the reduced consumption approach suffers from an underestimation of consumption. This is not surprising because the approach only collects information on the consumption of a subset of items. Applying the median as a summary statistic also results in an underestimation of consumption. As consumption distributions have a long right tail, the median consumption belongs to a poorer household than the average household. In the case of Hargeisa, several optional modules have a median of zero consumption. Thus, the median underestimates the consumption in a similar way to the reduced consumption approach. In contrast, the average consumption of households is larger than the consumption of the median household. Thus, it is not surprising that the technique using the average as a summary statistic overestimates total consumption at the household and cluster levels.

<sup>4</sup>We performed robustness checks with different item assignment to modules, including setting the parameter *d*=1 and *d*=2. The estimation results are extremely robust to changes in the item assignment to modules.

**Fig. 2** Average relative bias and standard error

The regression techniques have a similar performance, with a considerable upward bias at all levels. The Tobit regression performs slightly better at the household and cluster levels. As known from the literature about small area estimates, the regression approaches do not model the error distribution correctly and, thus, underestimate the tails of the distribution. Depending on the value of the poverty line relative to the mode of the distribution, this results in an over- or under-estimation of the poverty rate. In contrast, both imputation techniques perform exceptionally well, with a bias below 1% at all levels (Fig. 2).

While the bias is important for understanding the systematic deviation of the estimation, the relative standard error helps to understand the variation of the estimation. Outside a simulation setting, the standard error of the estimation cannot be calculated, as only one assignment of households to optional modules is available. Thus, it is important that the estimation technique delivers a small relative standard error.

Generally, the relative standard error decreases when moving from the household level through the cluster level to the simulation level. The relative standard error for the reduced consumption methodology is smaller than for the summary statistic techniques because the reduced consumption is not subject to variation from the module assignment to households. The regression techniques have large relative standard errors of around 20% at the household level, while the multiple imputation techniques vary between 15 and 20%. At the cluster level, the relative standard error drops to 7% for regression techniques and 5% for multiple imputation techniques. At the simulation level, the relative standard error is around 3% for regression techniques and 1% for multiple imputation techniques.

The distributional shape of the estimated household consumption level can be compared to the reference household consumption by employing standard poverty and inequality indicators. The poverty headcount (FGT0) is 57.4% for the reference distribution.<sup>5</sup> Not surprisingly, the reduced consumption technique and the median summary statistic overestimate poverty by several percentage points due to the underestimation of consumption, while the average summary statistic and the regression techniques underestimate poverty, since they overestimate consumption. The multiple imputation techniques overestimate poverty, but only by 0.5 percentage points (or about 1%), performing significantly better than the reduced consumption approach, which has a bias that is more than two times larger. The reduced consumption technique and the median summary statistic as well as the multiple imputation techniques deliver good results for FGT1 and FGT2, emphasizing that not only can the headcount be estimated reasonably well, but the distributional shape is also conserved. With the exception of the median summary statistic, these techniques also perform well estimating the Gini coefficient, with a bias of less than 0.5 percentage points. The relative standard errors show similar results to those for the estimation of consumption. The relative standard error of the reduced consumption for FGT0 is double that of the multiple imputation techniques. The relative standard errors for the multiple imputation techniques for FGT1 are comparable but larger than for FGT2 and Gini (Fig. 3).

<sup>5</sup>The FGT0 is calculated based on the US\$1.90 PPP (2011) international poverty line, converted into local currency in 2013.

**Fig. 3** Bias and standard errors
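The poverty and inequality indicators used in this comparison can be sketched in a few lines. The following is a minimal, unweighted illustration, assuming per capita consumption values in a plain list; the function names `fgt` and `gini` are ours and are not taken from the chapter, and a real survey analysis would also apply sampling weights.

```python
def fgt(consumption, poverty_line, alpha):
    """Foster-Greer-Thorbecke index (unweighted sketch).

    alpha = 0 gives the headcount (FGT0), alpha = 1 the poverty gap (FGT1),
    and alpha = 2 the poverty severity (FGT2).
    """
    total = 0.0
    for c in consumption:
        if c < poverty_line:
            # Normalized shortfall of a poor household, raised to alpha.
            total += ((poverty_line - c) / poverty_line) ** alpha
    return total / len(consumption)


def gini(consumption):
    """Gini coefficient from the rank-weighted sum of sorted values."""
    values = sorted(consumption)
    n = len(values)
    weighted_sum = sum(rank * v for rank, v in enumerate(values, start=1))
    return 2.0 * weighted_sum / (n * sum(values)) - (n + 1) / n


# Hypothetical consumption values and poverty line, for illustration only.
sample = [1.0, 2.0, 3.0, 4.0]
line = 2.5
headcount = fgt(sample, line, 0)
poverty_gap = fgt(sample, line, 1)
severity = fgt(sample, line, 2)
inequality = gini(sample)
```

Applying the same functions to the reference and to the estimated consumption vectors gives the bias of each indicator as the difference between the two results.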

In conclusion, the average summary statistic and the regression approaches cannot deliver convincing estimates. While the reduced consumption technique and the median summary statistic perform considerably better, they both overestimate poverty. Only the multiple imputation techniques are convincing in all estimation exercises. In terms of the estimation of the important poverty headcount (FGT0), the multiple imputation techniques are virtually unbiased.

# **4 Implementation Challenges, Lessons Learned, and Next Steps**

In late 2014, consumption data using the proposed rapid consumption methodology was collected in Mogadishu using CAPI. The rapid consumption questionnaire reduced face-to-face interview time considerably. A household visit took about 40 minutes on average (with a median of 35 minutes), including greetings, household characteristics, consumption modules, and a number of perception questions. Nine out of ten interviews took less than 65 minutes.

After data cleaning and quality assurance procedures, 675 households with consumption data were retained.<sup>6</sup> A welfare model was built to predict missing consumption in optional modules. The welfare model was tested on the core consumption, after removing the core consumption as an explanatory variable. The model for food consumption yielded an R² of 0.24, while non-food consumption was modeled with an R² of 0.16. It is important to emphasize that these models give a lower bound of the R² compared to the models used in the prediction, as the prediction models include the core consumption as an explanatory variable. Given the assessment of the different estimation techniques in the previous section, the multivariate normal approximation using multiple imputations is applied to the Mogadishu dataset.

For the Mogadishu dataset, the assignment of items to modules had to be manually refined.<sup>7</sup> The refinement had a minor impact on the share of consumption per module. It is curious, though, that the share of consumption per module is different for Hargeisa and Mogadishu. Using the Hargeisa dataset, 91% of food consumption (and 76% of non-food consumption) is captured in the core module. In contrast, the core food consumption share is only 64% (and 62% of non-food consumption) in Mogadishu before imputing the consumption of non-assigned modules. Thus, employing a reduced consumption module based on consumption shares identified in Hargeisa would have crudely underestimated consumption in Mogadishu, without being able to evaluate the inaccuracy. In contrast, the rapid consumption methodology allows the estimation of shares for each module, while the consumption estimation procedure implicitly takes into account the 'missing' consumption shares for each household (Table 2).

<sup>6</sup>While the survey also covered IDP camps, the analysis presented is restricted to households in residential areas, excluding IDP camps.

<sup>7</sup>Manual refinement is necessary to ensure that items like 'other fruits' do not double-count types of fruits not assigned to the household. This is implemented by relabeling and manually assigning modules. In addition, some item groups were split into individual items, which is generally preferable for recall and recording, as well as calculation of unit values.

The cumulative consumption distribution can be compared for the consumption captured in the core module, the assigned optional modules, and the imputed consumption. By construction, the core consumption shows the lowest consumption per household. Adding the consumption from the assigned optional modules shifts the cumulative consumption curve slightly. The imputed consumption is shifted even further as the estimated consumption shares from the non-assigned modules are added (Fig. 4).

Without full consumption aggregate values for Mogadishu, we can only show the consistency of the retrieved consumption aggregate with other household characteristics to validate the estimates. Consumption per capita usually decreases with increasing household size. Indeed, we find that household size is significantly negatively correlated with estimated per capita consumption.<sup>8</sup> Per capita consumption also decreases with a larger share of children among the household members. The proportion of employed members of the household significantly increases consumption per capita. Thus, the retrieved consumption estimate is consistent and, based on the evidence from the ex post simulations, highly accurate.

The results of the ex post simulation indicate that the rapid consumption methodology can reliably estimate consumption and poverty. The experience in Mogadishu also shows that the rapid consumption methodology can be implemented in extremely high-risk areas, due to its success in limiting face-to-face interview time to less than one hour. While these results are encouraging, the rapid consumption methodology has some limitations.

The rapid consumption questionnaire varies in comprehensiveness and the order of items in the consumption module between households. The effect of a response bias due to this variation can be estimated neither from the simulations nor from the data collected in Mogadishu. However, an enhanced design with different optional modules varying in their comprehensiveness can shed light on this bias. Comparison between responses for the same item in a comprehensive and a less comprehensive list would indicate a lower bound for response bias. Assuming that a comprehensive list results in a better estimate, the response bias could be corrected.

<sup>8</sup>The reported numbers are corrected against correlation with household characteristics included in the welfare model. As the welfare model for the prediction of consumption includes household size, we have run a robustness check excluding household size from the welfare model used for prediction. The correlation between consumption per capita and household size is still significant (coefficient: −0.03, t-statistic: −2.17, *p*-value: 0.03).

*Note* Simulated for Hargeisa; estimated for Mogadishu before imputation of non-assigned modules (normalized to 100%) and after imputing full consumption

**Fig. 4** Cumulative consumption distribution (in USD) per day and per capita (Color figure online) (*Note* For core module (dark blue), core and assigned optional modules (medium blue), and imputed consumption (light blue). The presented consumption aggregate does not include consumption from durable goods)

The rapid consumption methodology can increase the gap between capacity at enumerator level and the complexity of the survey instrument. Capacity at the enumerator level is often low in developing countries, especially in a fragile context. The rapid consumption methodology increases the complexity of the questionnaire, which can further increase the gap between existing and required enumerator capacity. However, CAPI technology can seal off complexity from enumerators, as software can automatically create the consumption module based on core and optional modules for each household without showing the partition to the enumerator. In Mogadishu, advanced CAPI technology was used to automatically generate the questionnaire based on the assignment of the household to an optional module. While enumerators were made aware that different households would be asked about different items, administering the rapid consumption questionnaire did not require any additional training of enumerators beyond that needed for a standard consumption questionnaire.

Analysis of rapid consumption data requires high capacity. Analysis capacity is usually limited in developing countries, and especially in fragile contexts. While the general idea of optional consumption modules being assigned to households is digestible by local counterparts, poverty analysis based on a bootstrapped sample of consumption distribution is likely to overwhelm local capacity. However, even standard poverty analysis is often beyond the limits of local capacity in fragile countries. Therefore, capacity building usually focuses on data collection skills with a longer-term perspective on increasing data analysis capacity. In addition, the rapid consumption methodology might be the only way of creating poverty estimates in certain areas, for example, in Mogadishu.

The results of the ex post simulation and the application of the methodology in Mogadishu suggest that the rapid consumption methodology is a promising approach to estimating consumption and poverty in a cost-efficient and fast manner, even in fragile areas.<sup>9</sup> A similar ex post simulation for South Sudan and Kenya (data not shown) indicates that the rapid consumption methodology can also be applied at the country level, with large intra-country consumption variation.<sup>10</sup> The rapid consumption methodology has been implemented in Somalia, South Sudan, and Kenya, with additional countries in the pipeline.

<sup>9</sup>Costs for implementing a rapid consumption survey are lower than those of conducting a full consumption survey due to the reduced face-to-face time needed, allowing enumerators to conduct more interviews per day.

<sup>10</sup>Ongoing fieldwork is currently employing the rapid consumption methodology in South Sudan to update poverty numbers.

# **Annex**

Consumption of non-assigned optional modules can be estimated by different techniques. Three classes, each with two techniques, are presented here, differing in their complexity and theoretical underpinnings. The first class of techniques uses summary statistics, such as the average, to impute missing data. The second class is based on multiple univariate regression models. The third class uses multiple imputation techniques, taking into account the variation absorbed by the residual term.

## **Summary Statistics (Mean and Median)**

This class of techniques computes a summary statistic from the module-specific consumption data collected and applies the result to the missing modules. Each household is assigned the same consumption per missing module. Here, the mean and the median are used as summary statistics. The median has the advantage of being more robust against outliers but cannot capture small module-specific consumption if more than half of the households have zero consumption for the module.
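As an illustration of this class of techniques, the sketch below imputes each non-assigned module with the mean or median observed among households that were assigned the module. The data layout (a `core` value plus a `modules` dictionary holding only assigned modules) and the function name `impute_with_summary` are assumptions made for this example, not the chapter's implementation.

```python
from statistics import mean, median


def impute_with_summary(households, module_ids, statistic=mean):
    """Total consumption per household, filling non-assigned modules with
    a summary statistic of the observed module consumption."""
    # One fill value per module, computed only from households assigned to it.
    fill = {}
    for m in module_ids:
        observed = [h["modules"][m] for h in households if m in h["modules"]]
        fill[m] = statistic(observed)

    totals = []
    for h in households:
        total = h["core"]
        for m in module_ids:
            # Use the collected value if available, otherwise the fill value.
            total += h["modules"].get(m, fill[m])
        totals.append(total)
    return totals


# Hypothetical data: five households, two optional modules "A" and "B".
households = [
    {"core": 10, "modules": {"A": 4}},
    {"core": 20, "modules": {"A": 0}},
    {"core": 30, "modules": {"B": 6}},
    {"core": 40, "modules": {"B": 0}},
    {"core": 50, "modules": {"A": 0}},
]
mean_totals = impute_with_summary(households, ["A", "B"], mean)
median_totals = impute_with_summary(households, ["A", "B"], median)
```

In this toy data, more than half of the households assigned module A report zero, so the median fill for A is zero while the mean fill is positive, which mirrors the limitation of the median noted above.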

## **Module-Wise Regression (OLS and Tobit Regression)**

Module-wise estimation applies a separate regression model for each module. This allows for differences in core consumption to be captured, as well as other household characteristics. Coefficients are estimated based only on the subsample assigned to the module under consideration. In general, a bootstrapping approach using the residual distribution could mimic multiple imputations, but this is not applied here. Given the impossibility of negative consumption, a Tobit regression with a lower bound of zero is used in addition to a standard OLS regression approach. For the OLS regression, negative imputed values are set to zero.
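The OLS variant can be sketched as follows. To stay self-contained, the example uses core consumption as the only explanatory variable, whereas the text also allows other household characteristics; the Tobit variant is not shown. The helper names and the data layout are hypothetical.

```python
def fit_simple_ols(x, y):
    """Intercept and slope of a one-regressor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def impute_module_ols(households, module_id):
    """Predict one module's consumption for households not assigned to it."""
    # Coefficients come only from the subsample assigned to this module.
    assigned = [h for h in households if module_id in h["modules"]]
    intercept, slope = fit_simple_ols(
        [h["core"] for h in assigned],
        [h["modules"][module_id] for h in assigned],
    )
    imputed = {}
    for index, h in enumerate(households):
        if module_id not in h["modules"]:
            # Negative predictions are set to zero, as described in the text.
            imputed[index] = max(0.0, intercept + slope * h["core"])
    return imputed


# Hypothetical data: three households assigned module "A", two not assigned.
households = [
    {"core": 10.0, "modules": {"A": 1.0}},
    {"core": 20.0, "modules": {"A": 3.0}},
    {"core": 30.0, "modules": {"A": 5.0}},
    {"core": 2.0, "modules": {}},
    {"core": 40.0, "modules": {}},
]
imputed = impute_module_ols(households, "A")
```

Running one such fit per optional module and adding the predictions to the collected consumption gives the household total under this approach.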

## **Multiple Imputation Chained Equations (MICE)**

Multiple Imputation Chained Equations (MICE) uses a regression model for each variable and allows missing values in the dependent and independent variables. As missing values are allowed in the independent variables, the consumption of all optional modules can be used as explanatory variables. As a first step, missing values in the explanatory variables are drawn randomly. These values are substituted iteratively with imputed values drawn from the posterior distribution estimated from the regression. While the technique of chained equations cannot be theoretically shown to converge in distribution, the results in practice are encouraging, and the method is widely used.

## **Multivariate Normal Regression (MImvn)**

Multiple Imputation Multivariate Normal Regression (MImvn) uses an expectation-maximization (EM)-like algorithm to iteratively estimate model parameters and missing data. In contrast to chained equations, this technique is guaranteed to converge in distribution to the optimal values. An EM algorithm draws missing data from a prior (often non-informative) distribution and runs an OLS regression to estimate the coefficients. The coefficients are iteratively updated based on re-estimation using imputed values for missing data drawn from the posterior distribution of the model. MImvn employs a data-augmentation (DA) algorithm, which is similar to an EM algorithm but updates parameters in a non-deterministic fashion. Thus, coefficients are drawn from the parameter posterior distribution rather than chosen by likelihood maximization. Hence, the iterative process is a Markov chain Monte Carlo (MCMC) method in the parameter space that converges to the stationary distribution averaging over the missing data. The distribution for the missing data stabilizes at the exact distribution to be drawn from, to retrieve model estimates averaging over the missing value distribution. The DA algorithm usually converges considerably faster than standard EM algorithms.

### **Estimation Performance**

The performance of the different estimation techniques is compared based on the relative bias (mean of the error distribution) and the relative standard error. We define the relative error as the percentage difference between the estimated consumption and the reference consumption (based on the full consumption module). The relative bias is the average of the relative error. The relative standard error is the standard deviation of the relative error. For estimations based on multiple imputations, the error is averaged over all imputations.

Each proposed estimation procedure is run on the random assignments of households to the optional modules. A constraint ensures that each optional module is assigned equally often to a household per enumeration area. The relative bias and the relative standard error are reported across all simulations.
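One simple way to produce such a balanced random assignment within a single enumeration area is to build an evenly filled list of module indices and shuffle it. This is a sketch under our own assumptions (one optional module per household, modules numbered from zero); the chapter does not specify how the constraint was implemented.

```python
import random


def assign_modules(household_ids, n_modules, rng):
    """Randomly assign one optional module per household so that module
    counts within the enumeration area differ by at most one."""
    # Cycle through module indices to fill slots evenly, then shuffle.
    slots = [i % n_modules for i in range(len(household_ids))]
    rng.shuffle(slots)
    return dict(zip(household_ids, slots))


# Hypothetical enumeration area with 12 households and 4 optional modules.
rng = random.Random(2014)  # fixed seed so a simulation run can be reproduced
area_households = [f"hh{i:02d}" for i in range(12)]
assignment = assign_modules(area_households, 4, rng)
```

Repeating the call with a different seed for each simulation yields a new assignment while keeping every module equally represented in the area.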

The performance measures can be calculated at different levels. At the household level, relative error is the relative difference in household consumption. At the cluster level, relative error is defined as the relative difference of the average reference household consumption and the average estimated household consumption across the households in the cluster. Similarly, the simulation level compares total average consumption for all households.
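The three levels can be sketched for a single simulation as follows. The function names are ours, the reference consumption is assumed to be strictly positive, and the population standard deviation is used for the relative standard error, a choice the text leaves open.

```python
from statistics import mean, pstdev


def household_errors(estimated, reference):
    """Relative error of each household's estimated consumption."""
    return [(e - r) / r for e, r in zip(estimated, reference)]


def cluster_errors(estimated, reference, clusters):
    """Relative error of average consumption within each cluster."""
    sums = {}
    for e, r, c in zip(estimated, reference, clusters):
        pair = sums.setdefault(c, [0.0, 0.0])
        pair[0] += e
        pair[1] += r
    # Within a cluster, the ratio of sums equals the ratio of averages.
    return [(e_sum - r_sum) / r_sum for e_sum, r_sum in sums.values()]


def simulation_error(estimated, reference):
    """Relative error of average consumption over all households."""
    return (mean(estimated) - mean(reference)) / mean(reference)


def bias_and_standard_error(errors):
    """Relative bias (mean) and relative standard error (std. deviation)."""
    return mean(errors), pstdev(errors)


# Hypothetical values for four households in two clusters.
estimated = [120.0, 90.0, 210.0, 190.0]
reference = [100.0, 100.0, 200.0, 200.0]
clusters = ["a", "a", "b", "b"]

hh_bias, hh_se = bias_and_standard_error(household_errors(estimated, reference))
cl_bias, cl_se = bias_and_standard_error(cluster_errors(estimated, reference, clusters))
sim_err = simulation_error(estimated, reference)
```

At the simulation level each run yields a single relative error, so the bias and standard error at that level would be computed across the simulation runs.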


# **10**

# **Studying Sensitive Topics in Fragile Contexts**

**Mohammad Isaqzadeh, Saad Gulzar and Jacob Shapiro**

# **1 Motivation**

Fragility, conflict, and violence (FCV) drastically undermines the effectiveness and efficiency of providing public goods and services to the poor. FCV is, moreover, a difficult field to study because of the sensitivity and complexity of the events to be addressed. Understanding how conflict and violence affect development programs and people's livelihoods in fragile states requires assessing people's perception of the state, insurgent groups, international actors, and actions taken by these actors. Expressing views about these actors and their activities, however, is risky for those living in fragile states. People may fear that expressing their views could cost them potential benefits and that they may incur threats by state and non-state actors, stigmatization, and social ostracism. As a result, questions on issues that are perceived to be sensitive can introduce sensitivity bias, that is, respondents may either avoid answering sensitive questions altogether or provide untruthful responses.

M. Isaqzadeh (\*) · J. Shapiro Princeton University, Princeton, NJ, USA e-mail: mri2@princeton.edu

S. Gulzar Stanford University, Stanford, CA, USA e-mail: gulzar@stanford.edu

© International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_10

Sensitivity biases generally originate from one of four sources: self-image, taboo (intrusive topics), risk of disclosure, and social desirability.<sup>1</sup> Self-image bias refers to untruthful replies based on misperceptions that individuals may have about themselves. Based on self-affirmation theory in psychology, individuals tend to maintain a perception of global integrity and moral adequacy and will reinterpret their own experience until their self-image is restored.<sup>2</sup> Individuals may therefore provide untruthful answers to questions that relate to their integrity and morality because of their distorted self-image, rather than because of an intent to deceive others. The second source of sensitivity bias is taboo or intrusive topics that respondents do not feel comfortable discussing with others. In such cases, non-response is more likely than untruthful answers as individuals try to avoid discussing the topic.<sup>3</sup> Risk of disclosure is the third source of sensitivity bias. Here, respondents are reluctant to reply at all, or to provide a truthful response, for fear that their response could be disclosed to the government, rebel groups, criminal groups, or local power holders.<sup>4</sup> Risk of disclosure, in the form of security threats by state and non-state actors or social sanctions by the community, is particularly relevant for research in an FCV context where the expression of views on sensitive topics could be very costly for individuals.<sup>5</sup>

<sup>1</sup>Our formulation here and in Sect. 2 draws heavily on Graeme Blair, Alexander Coppock, and Margaret Moor (2018), "When to Worry About Sensitivity Bias: Evidence from 500 List Experiments." Draft. The authors conduct a thorough meta-analysis of more than 500 list experiments (technique explained below).

<sup>2</sup>Steele, Claude M., Steven J. Spencer, and Michael Lynch (1993), "Self-Image Resilience and Dissonance: The Role of Affirmational Resources," *Journal of Personality and Social Psychology* 64 (6): 885–896; Liu, T. J., and C. M. Steele (1986), "Attribution as Self-Affirmation," *Journal of Personality and Social Psychology* 51: 531–540.

<sup>3</sup>Tourangeau, Roger, Lance J. Rips, and Kenneth Rasinski (2000), *The Psychology of Survey Response*. Cambridge: Cambridge University Press.

<sup>4</sup>Blair et al. (2018).

Finally, social scientists have long identified social desirability, the fourth source of bias, as a common threat to the validity of research findings.<sup>6</sup> Social desirability refers to 'the tendency on behalf of the subjects to deny socially undesirable traits and to claim socially desirable ones, and the tendency to say things which place the speaker in a favorable light.'<sup>7</sup> Social desirability usually reflects a respondent's concern about favorable attitudes of a reference group. The reference group could be peers, bystanders, family members or relatives present at the interview or even broader groups such as one's community or other communities, institutions, or individuals that consume the research findings.<sup>8</sup> An important reference group whose presence could introduce social desirability bias includes researchers and surveyors. In this case, social desirability is sometimes referred to as the 'experimenter demand effect.' In a study of anti-American sentiment in Pakistan, social desirability bias (social image) is found to potentially lead to the underestimation or overestimation of attitudes toward sensitive issues depending on whether those with extreme views conform to, and express views consistent with, moderate respondents, or vice versa.<sup>9</sup>

Experimenter demand effects highlight that even if a survey or experiment is conducted in a private context where peer pressure is ruled out, the presence of a researcher alone could introduce bias and prevent respondents from expressing honest views and attitudes.<sup>10</sup> In a randomized experiment, it was demonstrated that participants who did not vote in an election were 20 percentage points less likely to answer the door to participate in a survey when they had been previously informed through a flyer about the survey, relative to those who had not received a flyer.<sup>11</sup> The experiment shows the strength of stigma and shame that respondents may feel upon revealing that they did not vote to a surveyor, a stranger whom they may never interact with again.<sup>12</sup>

<sup>5</sup>Reminders of local insecurity reduce response rates on sensitive topics more than on other topics in a recent survey experiment in Somalia. Denny, Elaine, and Jesse Driscoll (2018), "Calling Mogadishu: How Reminders of Anarchy Bias Survey Participation," *The Journal of Experimental Political Science*. For an early paper on these measurement challenges, see Bullock, Will, Kosuke Imai, and Jacob N. Shapiro (2011), "Statistical Analysis of Endorsement Experiments: Measuring Support for Militant Groups in Pakistan," *Political Analysis* 19: 363–384.

<sup>6</sup>Nederhof, Anton J. (1985), "Methods of Coping with Social Desirability Bias: A Review," *European Journal of Social Psychology* 15: 263–280; Rosenthal, Robert (1963), "On the Social Psychology of the Psychological Experiment: The Experimenter's Hypothesis as Unintended Determinant of Experimental Results," *American Scientist* 51: 268–283; and Rosenthal, Robert (1966), *Experimenter Effects in Behavioral Research*. New York: Appleton Century-Crofts.

<sup>7</sup>Nederhof (1985: 264).

<sup>8</sup>Blair et al. (2018) and Tajfel, Henri, and John C. Turner (1979), "An Integrative Theory of Intergroup Conflict," *The Social Psychology of Intergroup Relations* 33 (47): 74.

<sup>9</sup>Bursztyn et al. (2017).

Social desirability bias may be even stronger in fragile contexts where social stigma could be costlier for individuals and where the association of surveys with aid and development projects could disincentivize truthful responses.

Regardless of the type, sensitivity bias can introduce two problems in surveys: item non-response and untruthful responses conditional on a response. In the case of item non-response, respondents take part in the survey but eschew answering sensitive questions, which is recorded as 'Don't Know' or 'Refused to Answer.' Item non-response can lead to an underestimation of sensitive attitudes/behaviors and bias estimates of treatment effects when sensitivity is correlated with treatment status.<sup>13</sup> Untruthful reply conditional on a response reflects cases where respondents do not avoid answering questions but provide deceitful replies. Both of these outcomes undermine research findings. Considering the importance of studying sensitive attitudes, researchers have invested in developing approaches to eliminate or reduce sensitivity biases. Below, we discuss these approaches and highlight whether they address item non-response, untruthful reply conditional on response, or both.

<sup>10</sup>Rosenthal (1963, 1966).

<sup>11</sup>Dellavigna et al. (2016).

<sup>12</sup>Dellavigna, Stefano, John A. List, Ulrike Malmendier, and Gautam Rao (2016), "Voting to Tell Others," *The Review of Economic Studies* 84 (1): 143–181.

<sup>13</sup>For example, when estimating the correlation between receiving aid and support for militant groups one might worry that respondents in pro-militant communities are more reluctant to express support if they have gotten aid because they fear future aid would be withheld. They therefore avoid the question at higher rates than those in other communities, leading one to erroneously conclude that receiving aid was negatively correlated with support for militants.

# **2 Approaches**

Researchers in the fields of psychology, economics, and political science have developed a range of approaches to studying sensitive attitudes, which can be very useful for conducting research and data collection in fragile contexts. Endorsement experiments, list experiments, and randomized response are the most commonly used techniques developed to mitigate sensitivity bias. Table 1 summarizes the three techniques, as well as direct questioning, with respect to their ability to mitigate different types of sensitivity biases.<sup>14</sup> The three techniques can clearly improve on direct questioning by reducing non-response and bias due to risk of disclosure and social desirability. However, they are costly in terms of sample size (because they leverage statistical inference on the difference between two groups vs. using the mean in one group), require extensive pre-testing, and cannot address bias due to the intrusiveness of the topic (taboos) and self-image. In this section, we review the three approaches, their advantages, and limitations.<sup>15</sup> At the end of the section, we will provide a brief overview of behavioral approaches to address sensitivity biases.

**Table 1** Survey approaches and addressing sensitivity biases

<sup>14</sup>We thank Graeme Blair for excellent advice on how to frame these issues.

## **2.1 Endorsement Experiments**

Endorsement experiments aim to mitigate non-response and biases due to social desirability and risk of disclosure by obfuscating the object of study. They were first used to study race relations in the US but were later used for studying support for states, international actors, and militant groups.<sup>16</sup>

Since questions about support for the state or insurgent groups in fragile states could pose safety issues for enumerators as well as respondents, direct questions about the state or insurgents may not elicit honest answers and typically face high non-response rates. Endorsement experiments overcome both issues by obfuscating the object of evaluation. When applied to measuring support for particular political actors, endorsement experiments seek respondents' views about particular policies, instead of asking the respondents to express views about particular groups or individuals. Researchers solicit views of actors by dividing respondents at random into treatment and control groups. In the control group, respondents are simply asked whether or not they support a particular policy. In the treatment group, respondents are asked the same questions but are reminded that the policy is endorsed by the groups or individuals who are the subject of the study. This approach is based on extensive research in social psychology, which shows that individuals are more likely to favor policies that are endorsed by individuals from groups whom they like.<sup>17</sup>

<sup>15</sup>For statistical software and several papers employing these methods, see Graeme Blair and Kosuke Imai's excellent website: http://sensitivequestions.org.

<sup>16</sup>Sniderman, Paul M., and Thomas Piazza (1993), *The Scar of Race*. Boston: Harvard University Press; Blair, Graeme, C. Christine Fair, Neil Malhotra, and Jacob N. Shapiro (2012), "Poverty and Support for Militant Politics: Evidence from Pakistan," *American Journal of Political Science*.

As endorsement experiments avoid direct questioning about sensitive topics, respondents feel more comfortable answering questions, reducing non-response rates. Because this method provides a reasonable degree of plausible deniability, respondents are more likely to provide truthful replies, reducing bias due to risk of disclosure and social desirability. This method can potentially mitigate bias due to taboo (intrusive topics) if researchers can phrase questions in such a way that respondents do not feel that intrusive words are being associated with them. It cannot, however, mitigate biases due to self-image because it does not deal with misperceptions that individuals have about themselves.

In a study on support for Islamist militant groups in Pakistan, researchers included questions about support for polio vaccination, among other policies.<sup>18</sup> The respondents in the control group received the following message: 'The World Health Organization recently announced a plan to introduce universal Polio vaccination across Pakistan. How much do you support such a policy?'

The respondents in the treatment group were administered this slightly different statement and question, one which associated the policy with one of four militant groups active in the country at the time: 'The World Health Organization recently announced a plan to introduce universal Polio vaccination across Pakistan. Pakistani militant groups fighting in Kashmir have voiced support for this program. How much do you support such a policy?'<sup>19</sup>

<sup>17</sup>Chaiken, S. (1980), "Heuristic Versus Systematic Information Processing and the Use of Source Versus Message Cues in Persuasion," *Journal of Personality and Social Psychology* 39 (5): 752–766; Petty, Richard E., John T. Cacioppo, and David Schumann (1983), "Central and Peripheral Routes to Advertising Effectiveness: The Moderating Role of Involvement," *Journal of Consumer Research* 10 (2): 135–146; and Wood, Wendy, and Carl A. Kallgren (1988), "Communicator Attributes and Persuasion: Recipients' Access to Attitude-Relevant Information in Memory," *Personality and Social Psychology Bulletin* 14 (1): 172–182.

<sup>18</sup>Blair et al. (2012).

<sup>19</sup>Blair et al. (2012).

Compared to the direct questions about the militant groups in this study, the endorsement experiment questions received much lower non-response rates. For instance, while the non-response rate for direct questions ranged from 22% (questions about Al-Qaeda) to 6% (questions about the Kashmir Tanzeem), the non-response rate for endorsement experiments was much lower, ranging from 7.6 to 0.6%.

In addition to measuring sensitive attitudes, endorsement experiments can be used to study sensitive political behaviors. One study used an endorsement experiment to study voting 'no' on a personhood referendum in Mississippi.<sup>20</sup> The researchers administered two slightly different primes to the treatment and control groups, as in the following box.


*Source* Rosenfeld et al. (2015)

By obfuscating the researcher's intention and object of evaluation, endorsement experiments are useful in reducing non-response bias and recovering estimates of sensitive attitudes. Official results from an anti-abortion referendum in Mississippi in 2011 showed that while

<sup>20</sup>Rosenfeld, Bryn, Kosuke Imai, and Jacob N. Shapiro (2015), "An Empirical Validation Study of Popular Survey Methodologies for Sensitive Questions," *American Journal of Political Science*, 1–20.

direct questioning significantly underestimated the votes against the referendum (by close to 20% in most counties) and had significant non-response rates, the endorsement experiment and the list experiment (discussed below) reduced item non-response and removed approximately half the underestimate of 'no' votes. In contrast, randomized response methods (also discussed below) almost completely recovered the known vote shares.<sup>21</sup>

A number of studies have utilized endorsement experiments to study a range of sensitive topics, particularly support for the state and insurgents in fragile states.22 A useful resource on this topic is a comprehensive guide for, and illustration of, questioning strategy, regression methods, and analysis tools (including software package in R) for endorsement experiments.23

The advantage of an endorsement experiment is that it obscures the object of the evaluation above and beyond concealing the respondent's answer to the sensitive question. The main disadvantage is that a latent variable model is needed to estimate sensitive behavior and attitudes. In addition, the endorsement effect does not have an obvious scale: it is unclear a priori how a given percentage change in support for a policy when it is associated with a group, compared with when it is not, maps onto a standard Likert scale running from strongly supporting to strongly opposing the group. Its estimates are also statistically inefficient (in the sense of requiring a larger sample to achieve a given confidence interval) compared to the other indirect methods discussed below.<sup>24</sup>

<sup>21</sup>Rosenfeld et al. (2015).

<sup>22</sup>See, for example: Lyall, Jason, Graeme Blair, and Kosuke Imai (2013), "Explaining Support for Combatants During Wartime: A Survey Experiment in Afghanistan." *American Political Science Review* 107 (4): 679–705; and Blair, Graeme, Jason Lyall, and Kosuke Imai, (2014), "Comparing and Combining List and Endorsement Experiments: Evidence from Afghanistan," *American Journal of Political Science* 58 (4): 1043–1063.

<sup>23</sup>Bullock et al. (2011), follow-on the work by Bullock et al. (2011). For the relevant software package in R and analysis tools, refer to http://endorse.sensitivequestions.org/.

<sup>24</sup>Rosenfeld et al. (2015).

### **2.2 List Experiments**

List experiments try to mitigate sensitivity biases by introducing uncertainty through aggregation. This method, also referred to as the 'item count technique', has been extensively used to study racial attitudes and prejudice as well as voter turnout and vote buying.<sup>25</sup>

As in the endorsement experiment, the sample is randomly divided into treatment and control groups. Both groups are asked to state the total number of items on a list that they view as favorable or unfavorable (or the number of actions they have taken), without identifying which specific items are favorable or unfavorable. The two groups receive similar lists except that the response options for the treatment group include one additional item, the sensitive item which is the subject of the study.
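Because the two lists differ only by the sensitive item, the difference in mean item counts between the treatment and control groups estimates the share of respondents to whom the sensitive item applies. The following is a minimal sketch of that difference-in-means estimator. The function names and the counts are hypothetical and chosen for illustration only. The second helper reports how many control respondents sit at the lowest or highest possible count, which relates to the floor and ceiling effects discussed later in this section.

```python
import math
import statistics


def list_experiment_estimate(control_counts, treatment_counts):
    """Estimate prevalence of the sensitive item and its standard error.

    Each list holds one item count per respondent. The treatment list
    contains the control items plus the sensitive item.
    """
    estimate = statistics.mean(treatment_counts) - statistics.mean(control_counts)
    se = math.sqrt(
        statistics.variance(treatment_counts) / len(treatment_counts)
        + statistics.variance(control_counts) / len(control_counts)
    )
    return estimate, se


def floor_ceiling_share(control_counts, n_control_items):
    """Share of control respondents reporting 0 or all control items."""
    extreme = sum(1 for c in control_counts if c in (0, n_control_items))
    return extreme / len(control_counts)


# Hypothetical counts for a list with three control items.
control_counts = [1, 2, 2, 3, 1, 2, 0, 2]
treatment_counts = [2, 2, 2, 3, 1, 3, 0, 2]

prevalence, std_error = list_experiment_estimate(control_counts, treatment_counts)
at_extremes = floor_ceiling_share(control_counts, n_control_items=3)
print(f"estimated prevalence: {prevalence:.2f} (SE {std_error:.2f})")
print(f"control respondents at floor or ceiling: {at_extremes:.2f}")
```

With samples this small the standard error is large relative to the estimate, which is one reason list experiments generally need larger samples than direct questions.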

As with endorsement experiments, list experiments can be used to study both sensitive attitudes and behavior.<sup>26</sup> A list experiment to study vote buying in Nicaragua found that almost one quarter of voters were offered gifts or services in exchange for votes while only 3% reported such activities when asked directly.<sup>27</sup> The following box shows the control and treatment statements used for assessing vote buying.

A regression analysis technique can be used to analyze list experiment data and recent work illustrates the application of the method

<sup>25</sup>Raghavarao, Damaraju, and Walter T. Federer (1979), "Block Total Response as an Alternative to the Randomized Response Method in Surveys," *Journal of the Royal Statistical Society, Series B (Statistical Methodology)* 41 (1): 40–45; Gonzalez-Ocantos, Ezequiel, Chad Kiewiet de Jonge, Carlos Meléndez, Javier Osorio, and David W. Nickerson (2012), "Vote Buying and Social Desirability Bias: Experimental Evidence from Nicaragua," *American Journal of Political Science* 56: 202–217; Kuklinski, J., M. Cobb, and M. Gilens (1997), "Racial Attitudes and the 'New South,'" *Journal of Politics* 59 (2): 323–349; and Holbrook, A. L., and J. A. Krosnick (2010), "Social Desirability Bias in Voter Turnout Reports: Tests Using the Item Count Technique," *Public Opinion Quarterly* 74 (1): 37–67.

<sup>26</sup>For examples of research using list experiments to study racial attitudes see Kuklinski et al. (1997) and Kuklinski, J., P. Sniderman, K. Knight, T. Piazza, P. Tetlock, G. Lawrence, and B. Mellers (1997), "Racial Prejudice and Attitudes Toward Affirmative Action," *American Journal of Political Science* 41 (2): 402–419.

<sup>27</sup>Gonzalez-Ocantos et al. (2012).

investigating racial hatred in the US based on the 1991 National Race and Politics Survey.<sup>28</sup> There is also a wide range of studies that have relied on list experiments for studying sensitive topics.<sup>29</sup>


*Source* Gonzalez-Ocantos et al. (2012)

The advantage of list experiments is that respondents do not disclose whether the sensitive item applies to them. By concealing which items a respondent has favorable or unfavorable views about, the list experiment can reduce non-response rates and mitigate biases due to the risk of disclosure and social desirability. Since respondents do not actually reveal which items they agree or disagree with, this method could alleviate the respondents' fear of disclosing their views and their concerns about reference groups. By only expressing the number of favorable or unfavorable items, they can deny reference to the sensitive item. This method, however, cannot mitigate biases due to taboo since the intrusive

<sup>28</sup>Imai, Kosuke (2011), "Multivariate Regression Analysis for the Item Count Technique," *Journal of the American Statistical Association* 106 (494): 407–417. The software package in R for analysis of list experiments can be obtained at http://list.sensitivequestions.org/.

<sup>29</sup>Blair et al. (2018).

words need to be mentioned either in the question or in the options. Nor can this method reduce biases due to self-image. The main drawback of this approach is the problem of floor and ceiling effects. In the example above, if the respondent has experienced all the control items, then an honest response would no longer be obscured, because it reveals that the respondent received a gift or favor in exchange for a vote. This is an example of the ceiling effect.<sup>30</sup>

In a comprehensive meta-analysis of list experiments applied to political attitudes and behaviors, the list experiment performs well, both in terms of recovering estimates consistent with direct questions about non-sensitive behaviors and in terms of reducing bias.31

### **2.3 Randomized Response**

The randomized response approach is useful for estimating population-level variables by obscuring respondents' truthful answers through introducing noise in the responses.<sup>32</sup> In this approach, respondents rely on a random outcome (such as flipping a coin) to add noise to the response, noise whose distribution the researcher knows, and can thus later remove from population-level summaries of the responses.

Randomized response questions come in two variants. In the disguised response version, the respondent is given two questions (an innocuous question and a sensitive question) and asked to flip a coin or use another randomizing device out of sight of the surveyor. The coin flip determines which of the two questions the respondent answers. In the forced response version, the respondent is asked to answer the sensitive question, but for some outcomes of the randomizing device the answer is dictated by the device, obfuscating each individual's answer. The following box provides an illustration of these techniques.

<sup>30</sup>Rosenfeld et al. (2015) and Glynn, Adam N. (2013), "What We Can Learn With Statistical Truth Serum? Design and Analysis of the List Experiment," *Public Opinion Quarterly* 77: 159–172.

<sup>31</sup>Blair et al. (2018).

<sup>32</sup>Warner, Stanley L. (1965), "Randomized Response: A Survey Technique for Eliminating Evasive Answer Bias," *Journal of the American Statistical Association* 60 (309): 63–69.


*Source* Blair et al. (2015)
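Since the researcher knows the distribution of the noise, it can be removed from the aggregate share of 'yes' answers. The sketch below illustrates this for the forced response variant only. It assumes, purely for illustration, a die where one face forces a 'yes', one face forces a 'no', and the remaining faces call for a truthful answer. The function name, the probabilities, and the answers are hypothetical.

```python
import math


def forced_response_estimate(answers, p_forced_yes, p_forced_no):
    """Estimate prevalence of the sensitive trait under forced response.

    `answers` holds 1 for 'yes' and 0 for 'no'. With probability
    `p_forced_yes` the device forces a 'yes', with probability
    `p_forced_no` it forces a 'no', and otherwise the respondent
    answers truthfully, so
        P(yes) = p_forced_yes + (1 - p_forced_yes - p_forced_no) * prevalence.
    """
    n = len(answers)
    share_yes = sum(answers) / n
    p_truth = 1.0 - p_forced_yes - p_forced_no
    prevalence = (share_yes - p_forced_yes) / p_truth
    # The noise inflates the standard error by a factor of 1 / p_truth.
    se = math.sqrt(share_yes * (1.0 - share_yes) / n) / p_truth
    return prevalence, se


# Hypothetical data: 30 'yes' answers out of 100 respondents.
answers = [1] * 30 + [0] * 70
prevalence, std_error = forced_response_estimate(
    answers, p_forced_yes=1 / 6, p_forced_no=1 / 6
)
print(f"estimated prevalence: {prevalence:.2f} (SE {std_error:.2f})")
```

The estimate applies to the population only. No individual answer can be attributed to the respondent's true status, which is the source of the method's protection.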

Although the randomized response approach has not been used as widely as the endorsement and list experiments because it is slightly harder to explain to respondents, it is an effective method for studying sensitive attitudes and behaviors in contexts where the population is familiar with a randomization device such as dice.<sup>33</sup> The randomized response technique has been used to study social connections and contacts with members of armed groups in Nigeria, a topic that was not only sensitive but could even pose security threats to the respondents and surveyors if inquired about directly. This method has been used for estimating a range of sensitive behaviors, from application faking to cheating and drug use.<sup>34</sup> In the study on Nigeria, a multivariate regression analysis

<sup>33</sup>Blair, Graeme, Kosuke Imai, and Yang-Yang Zhou (2015), "Design and Analysis of the Randomized Response Technique," *Journal of the American Statistical Association* 110 (511): 1304–1319.

<sup>34</sup>Donovan, John J., Stephen A. Dwight, and Gregory M. Hurtz (2009), "An Assessment of the Prevalence, Severity, and Verifiability of Entry-Level Applicant Faking Using the Randomized Response Technique," *Human Performance* 16 (1): 81–106; Scheers, N. J., and C. Mitchell Dayton (1987), "Improved Estimation of Academic Cheating Behavior Using the Randomized Response Technique," *Research in Higher Education* 26 (1): 61–69; Goodstadt, Michael S., and Valerie Gruson (2012), "The Randomized Response Technique: A Test on Drug Use," *Journal of the American Statistical Association* 70 (352): 814–818; and Clark, Stephen J., and Robert A. Desharnais (1998), "Honest Answers to Embarrassing Questions: Detecting Cheating in the Randomized Response Model," *Psychological Methods* 3 (2): 160–168.

technique was used, and researchers provided guidance for power analysis and robust design for randomized response and illustration of applying this technique to their study of contacts with armed groups in Nigeria, in addition to a software package in R for data analysis.35,36

Validation studies of the randomized response approach have led to mixed results. A number of validation studies have found that the randomized response method leads to less biased estimates than direct questioning and reduces item non-response, although it is not always better than list experiments and endorsement experiments. In a validation of the Mississippi referendum on the 'Personhood Initiative', the authors found that randomized response outperformed other methods in terms of reducing bias.<sup>37</sup> Compared to the actual referendum results, the bias in the weighted estimate of support for the referendum was only 0.04 in the randomized response while it was 0.236 in the direct question, 0.149 in the list experiment and 0.069 in the endorsement experiment. However, this method was not the best in reducing the non-response rate. Although the non-response rate in the randomized response (13%) was lower than in the direct question method (20%), it was much higher than the non-response rate in the list experiment (2%) and the endorsement experiment (0.003%).

The main disadvantage of a randomized response approach is that it requires respondents to administer the randomization themselves, which can lead to high rates of item non-response and even survey attrition. Furthermore, using randomizing devices or flipping coins may be culturally inappropriate in some contexts. A number of validation studies report high rates of non-response and less valid estimates for the randomized response approach than for a list experiment, although other studies have found more favorable results and smaller non-response rates.<sup>38</sup>

<sup>35</sup>Blair, Graeme, Kosuke Imai, and Yang-Yang Zhou (2015), "Design and Analysis of the Randomized Response Technique," *Journal of the American Statistical Association* 110 (511): 1304–1319.

<sup>36</sup>Te software package in R can be obtained at http://rr.sensitivequestions.org/.

<sup>37</sup>Rosenfeld et al. (2015).

<sup>38</sup>For the discussion of advantages and disadvantages of randomized response, see Rosenfeld et al. (2015).

### **2.4 Behavioral Approaches**

Behavioral approaches mitigate sensitivity bias through direct observation of behaviors that reveal preferences without direct inquiry about those preferences. Two common approaches to measuring behavior are dictator games (where the participants are asked to decide whether they want to share money with another participant) and 'offer' experiments (where the respondents decide whether or not to accept an amount of money). The strength of these approaches lies in their indirect measurement of sensitive attitudes and the high degree to which they obfuscate the objective of the research.

Behavioral approaches have been used in studying a range of attitudes and behaviors, such as discrimination and xenophobia, altruism and prosocial behavior, religious beliefs, and anti-American attitudes.<sup>39</sup> For instance, one study uses financial costs to indirectly study anti-American identity in Pakistan.<sup>40</sup> Study participants were given Pakistani Rupees (Rs.) 100 or 500, when the daily wage of a manual laborer is between Rs. 400 and 500, merely for checking a box to thank the donor. As shown in the box below, in one version of the instrument, the donor was local (the Lahore University of Management Science) while in the second version it was foreign (the US government).

<sup>39</sup>Studies of discrimination and xenophobia include Becker, Gary S. (1957), *The Economics of Discrimination*. Chicago: University of Chicago Press; Bursztyn, Leonardo, Georgy Egorov, and Stefano Fiorin (2017), "From Extreme to Mainstream: How Social Norms Unravel," NBER Working Paper No. 23415, May 2017; Rao, Gautam (2013), "Familiarity Does Not Breed Contempt: Diversity, Discrimination and Generosity in Delhi Schools," Working Paper, https://scholar.harvard.edu/rao/publications/familiarity-does-not-breed-contempt-diversity-discrimination-and-generosity-delhi. For altruism and prosocial behavior, see Andreoni, James (1990), "Impure Altruism and Donations to Public Goods: A Theory of Warm-Glow," *Economic Journal* 100: 464–477; DellaVigna, Stefano, John A. List, and Ulrike Malmendier (2012), "Testing for Altruism and Social Pressure in Charitable Giving," *Quarterly Journal of Economics* 127 (1): 1–56; and Ariely, Dan, Anat Bracha, and Stephan Meier (2009), "Doing Good or Doing Well? Image Motivation and Monetary Incentives in Behaving Prosocially," *American Economic Review* 99 (1): 544–555. For studies using monetary offers to study religiosity, see Augenblick, Ned, Jesse M. Cunha, Ernesto Dal Bó, and Justin M. Rao (2012), "The Economics of Faith: Using an Apocalyptic Prophecy to Elicit Religious Beliefs in the Field," NBER Working Paper No. 18641, December 2012; Condra, Luke N., Mohammad Isaqzadeh, and Sera Linardi (2017), "Clerics and Scriptures: Experimentally Disentangling the Influence of Religious Authority in Afghanistan," *British Journal of Political Science*, 1–19.

<sup>40</sup>Bursztyn et al. (2017).


*Source* Bursztyn et al. (2012)

The study in Pakistan found that when participants make the decision privately and the source of the funds is the US government, almost one quarter of them forgo the money, Rs. 100.<sup>41</sup> However, when they expect their decision to be public, a significantly smaller proportion (around 10%) rejects the payment. The authors conclude that since the participants expect the majority to accept the payment from the US government, a substantial number of them (15%) conform to the majority and accept the payment although they would not in private. When the payment is increased to Rs. 500, the rejection rate falls from 25%, but a significant proportion of the participants (10%) still forgo the payment.

# **3 Practical Issues**

In addition to being useful tools in recovering truthful responses, the indirect methods reviewed in this chapter have a number of practical advantages over direct questioning. First, they help reduce survey staff vulnerability, which might be particularly important in conflict settings. By masking the nature of the question itself, survey staff are more likely

<sup>41</sup>Bursztyn et al. (2017).

to be protected when local authorities do not allow sensitive questions to be asked, despite legal protection. There is also the added benefit that plausible deniability may protect individuals by not revealing their true response at the individual level in case the survey instruments are compromised. These issues typically do not arise in non-conflict settings but can be particularly important when protecting individual responses is critical.

Although the indirect methods for studying sensitive topics outperform direct questioning in many settings, they also have limitations. First, the indirect methods add noise to the estimates, which means that for any given level of statistical power, much larger samples are required to measure group-level differences.<sup>42</sup> Although scholars have proposed ways to reduce noise and remedy the problem of large samples in some cases (such as using double lists or negatively correlated items in a list experiment), the requirement of a large sample remains an important drawback of these indirect methods.<sup>43</sup> Second, these methods require much more extensive pre-testing and preparation than direct questions, which would increase the costs (both financial and human resources) for studying the same topics and could affect the research timeline as well. Third, although these methods reduce sensitivity bias, they cannot overcome incentive compatibility issues. These methods may not provide incentives for the respondents to reveal their true views and attitudes even if they are assured that their individual views will not be disclosed. In essence, these methods reduce the cost of expressing views as long as respondents are interested in expressing their views. If the respondents see advantages in concealing their views and attitudes, these methods do not provide them with incentives to express their views. Some of the behavioral approaches overcome this problem by imposing costs on the

<sup>42</sup>Blair et al. (2018) show that most prior list experiments have been underpowered and recommend using direct questions for all but the most sensitive questions unless large samples can be obtained.

<sup>43</sup>For discussion of how to address ceiling effects and reduce noise in list experiments see Glynn (2013).

subjects if they do not reveal their preferences, but the three indirect methods do not impose such costs.44
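The first limitation above, the larger samples required by indirect methods, can be made concrete with a rough variance comparison between a direct question and a list experiment. The sketch below rests on simplifying assumptions that are not taken from the studies cited: respondents answer honestly in both designs, the sample is split evenly between treatment and control, and the sensitive item is independent of the control-item count. The function names and input values are hypothetical.

```python
def direct_variance(prevalence, n):
    """Variance of a sample proportion from a direct yes/no question."""
    return prevalence * (1.0 - prevalence) / n


def list_variance(prevalence, control_var, n):
    """Variance of the list experiment difference-in-means estimator.

    Assumes n respondents split evenly between arms and a sensitive item
    that is independent of the control count, so the treatment-arm count
    has variance control_var + prevalence * (1 - prevalence).
    """
    per_arm = n / 2.0
    treatment_var = control_var + prevalence * (1.0 - prevalence)
    return control_var / per_arm + treatment_var / per_arm


def sample_size_ratio(prevalence, control_var):
    """How many times more respondents the list experiment needs
    to match the precision of a direct question."""
    return list_variance(prevalence, control_var, 1.0) / direct_variance(prevalence, 1.0)


# Hypothetical inputs: true prevalence of 25%, control-count variance of 0.75.
ratio = sample_size_ratio(prevalence=0.25, control_var=0.75)
print(f"sample size multiplier: {ratio:.1f}")
```

Under these assumptions the multiplier grows with the variance of the control-item count, which is why designs that reduce that variance, such as negatively correlated control items, help contain the sample size.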

The most important lesson learned from the studies that have utilized indirect methods, however, is the significance of pre-testing. Endorsement experiments require finding political issues on which the groups in question would plausibly take a stand and that all relate to the same latent policy dimension. Properly implementing list experiments requires choosing control items so that floor and ceiling effects are avoided for almost all respondents. And randomized response requires finding a culturally appropriate randomization device and choosing the appropriate type of question. In short, all indirect methods require much more pre-testing of questions and instruments than traditional direct questions do in order to ensure that they can recover the truthful replies in which researchers are interested.

Given the cultural and contextual diversity of FCV contexts, some of these methods may work in some contexts but not in others. It is very important to select the appropriate method taking into consideration the concerns and context where the research is conducted. Finally, if feasible, researchers should consider validating the findings of indirect methods by comparing them with census data or social media data where these are available.

## **References**


<sup>44</sup>In Bursztyn et al. (2017), for instance, costs are imposed on the subjects (forgoing payments from the U.S. government) for expressing anti-American identity. Game theory and "offer" experiments use financial incentives to study altruism.



Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.


# **11**

# **Eliciting Accurate Consumption Responses from Vulnerable Populations**

**Lennart Kaplan, Utz Pape and James Walsh**

# **1 The Data Demand and Challenge**

Accurate data on the key economic variables affecting people who have been forcibly displaced, such as consumption and assets, is essential to understanding their situation and to developing evidence-based policies to support them. Poor information or data inaccuracies can lead to flawed diagnostics and impact assessments, resulting in inefficient use and a waste of limited resources. In the context of displacement, consumption

L. Kaplan

University of Göttingen, Göttingen, Germany

Heidelberg University, Heidelberg, Germany

U. Pape (\*) · J. Walsh World Bank, Washington, DC, USA e-mail: upape@worldbank.org

### J. Walsh

e-mail: james.walsh@nuffield.ox.ac.uk


<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_11

data is particularly important because malnutrition is rife and mortality rates are high, and interventions using consumption data are needed to support the immediate basic needs of vulnerable populations.

In previous High Frequency Survey (HFS) rounds, approximately 45% of Somali Internally Displaced Persons (IDP) households reported food consumption below subsistence levels, and 80% reported consumption below recommended levels. It is no surprise that IDP populations report lower consumption levels. IDPs face significant hardship that hinders their potential for generating adequate livelihoods, such as experiencing the loss of a breadwinner, not having any productive assets, or having fallen victim to violence. Indeed, IDPs have much less control over their own livelihoods, employment opportunities are scarce within camps, and a large part of their consumption is provided for through aid by NGOs and international organizations.

Yet there are also reasons to suspect that the low reported levels of consumption are due, at least in part, to misreporting. First, very low levels of consumption are associated with high rates of mortality due to starvation. The observed mortality rates among IDPs, however, do not indicate that mortality due to starvation increased across the country on such a scale.<sup>1</sup> Second, non-IDP households that are statistically similar on observable characteristics report higher levels of consumption than IDP households. While IDPs and non-IDPs may have different opportunities to generate income, it is unlikely that IDPs would allocate their resources between food and non-food consumption in a way that endangers their lives. The vulnerability of the population raises the stakes for getting the data right: for policymakers designing programs to support IDPs, spurious data is either unusable or biased.

The potential for surveys to generate information that is systematically biased is well documented. A large body of research focuses on improving the accuracy of self-reported information collected in household surveys.<sup>2</sup> In the context of IDPs, that respondents feel compelled

<sup>1</sup>Although data from the USAID-led Famine Early Warning Systems Network (FEWS NET) suggest high levels of malnutrition, evidence on mortality across the counties is mixed (FEWS NET 2018).

<sup>2</sup>There are a number of mechanisms through which the validity of self-reported information in surveys can be compromised. Some inaccuracies result from cognitive biases, for example, acquiescence or "yea-saying" (Bachman and O'Malley 1984; Hurd 1999), extreme responding

to misreport is particularly relevant. Indeed, survey respondents in IDP camps may believe that their responses will influence the provision of humanitarian aid and will thus misreport consumption in an attempt to influence its distribution. If survey respondents are underreporting, the inaccuracies generated in the data are highly problematic. At best, underreporting makes the data spurious and unusable. At worst, it could lead to misallocations of aid, from more vulnerable areas to less vulnerable areas, or from solutions emphasizing sustainability to immediate relief when immediate relief is unnecessary. Given this context, light-touch adaptations to the survey design that prime the idea of honesty offer the potential for large improvements in the quality of the data and of the support provisions the data informs.<sup>3</sup>

## **2 The Implementation**

The experiment included 4145 IDP and 781 non-IDP households across South Sudan and was rolled out in mid to late 2017. To investigate whether consumption might be underreported by IDP populations, households were randomly exposed to a bundle of 'honesty primes.' The treatment had three components, which were simultaneously administered in one treatment arm (Fig. 1). These included an emphasis on the importance of accurate answers at the beginning of the survey, a short fictional scenario, which required passing judgment on the behavior of one of the characters, and additional questions to

(Cronbach 1946; Hamilton 1968), and question order bias (Sigelman 1981). Other inaccuracies emerge from conscious but not calculated behavior. Respondents may deliberately misreport information on sensitive subjects not to distort statistics but to maintain their reputation or to abide by political norms (Gilens et al. 1998; Rosenfeld et al. 2016). Some misreporting is purposeful. Individuals may misreport in a calculated fashion to increase earnings in a study context (Mazar et al. 2008) or to shape the results of the study if they believe that it will inform policy. It is not surprising that this problem might arise in the context of development aid, an area rife with perverse incentives (Bräutigam and Knack 2004; Cilliers et al. 2015).

<sup>3</sup>This chapter is a summary of Kaplan, Pape, and Walsh (2018, forthcoming), "Eliciting Accurate Responses to Consumption Questions Among IDPs in South Sudan Using 'Honesty Primes'," Policy Research Working Paper Series. The World Bank.

**Fig. 1** Treatment Components (*Source* Authors' visualization)

determine the household's last meal, asking respondents to explicitly report whether or not they have eaten in the last week.<sup>4,5</sup> While the first two components targeted intentional misreporting, the third addressed classical measurement error.<sup>6</sup> The bundle of primes addressed different psychological mechanisms:


<sup>4</sup>Mazar and Ariely (2006).

<sup>5</sup>One example of this is when individuals' beliefs regarding the consequences of lying affect their behavior. In a two-person experiment where one participant can increase her payoff by lying but at the expense of her counterpart, Gneezy (2005) finds that individuals' propensity to lie is sensitive to the costs it imposes on the other person.

<sup>6</sup>Rasinski et al. (2005) and Vinski and Watter (2012).

<sup>7</sup>Talwar et al. (2015).

answer truthfully to sustain self-consistency. People make decisions on the basis of both external and internal reward systems: even when people have a material incentive to lie, their internal drive to protect their self-integrity may override.8,9

3. Investigative probing: This places a higher salience on the importance of getting the answer to the question right. By asking for broader categories first, subsequent sub-categories are put under more scrutiny. Self-consistency is reinforced by relating to a longer recall period of seven days.

It is important to note that the treatment is not designed to directly elicit increases in reported consumption. Rather, the intention is to bring the importance of honesty into focus during the interview. It is only through this mechanism, increases in honesty, that we should expect to indirectly see increases in consumption. Thus, ex-ante, we should not expect the treatment effects to be uniform across the consumption distribution.

Almost one-third of respondents (30.1%) reported a calorie intake below the daily subsistence level of 1200 kcal per day and the median per capita consumption was below the recommended calorie intake (1589 kcal per day). Conditioning on adult equivalents, the median shifted well above the recommended daily intake. However, a substantial part of the distribution, 16%, still reported being below the subsistence level and 40% reported being below the recommended daily intake.10 As with the number of consumption items, the graph indicates that there was a slight shift in the reported consumption among the treated, with respect to very low consumption levels.

<sup>8</sup>Mazar and Ariely (2006).

<sup>9</sup>One example of this is when individuals' beliefs regarding the consequences of lying affect their behavior. In a two-person experiment where one participant can increase her payoff by lying but at the expense of her counterpart, Gneezy (2005) finds that individuals' propensity to lie is sensitive to the costs it imposes on the other person.

<sup>10</sup>Several respondents report overly high consumption levels, which far surpass conventional levels (>4000 kcal per day). Robustness checks take this issue into account by censoring the data at the extremes.

Different dependent variables are specified because they have different implications for the respondent's scope of influence on their value. The impact of the 'honesty primes' on the total consumption value, both in terms of money and food intake, is of primary interest. Yet, they are second-order values that are calculated as a function of other variables, including consumption quantities and calories or prices that are in turn deflated. These variables are difficult for respondents to falsify because of the intense mental computation required. The consumption quantity in kilograms is a more direct measure of the quantity consumed as expressed by the respondent and may lead to more accurate estimation of the impact of the 'honesty primes.' Finally, counting the number of items may lead to an even more accurate measure, since the variable is not cleaned and is taken at face value. Furthermore, omitting an item is the easiest and quickest way for respondents to reduce the value of the household's consumption.11

# **3 Key Results**

There is a small difference in reported consumption on average between the treatment and control group. The consumption levels shown in Fig. 2 show a slight difference in consumption between IDP households in the treatment and control groups, though this is apparent only at lower levels of consumption, below SSP 400. In contrast, the distribution of consumption across the two groups matches much more closely for the non-IDP population. The distribution of the number of items displays a similar pattern, though the effect is also faint (Fig. 3). Again, a difference is not visible in the non-IDP population. The number of observations for the non-IDP population is much lower than for the IDP population, and the variance of the distribution is expected to be much greater.

If respondents are deliberately misreporting, those misreporters are likely to be doing so at low consumption levels (e.g., it is more likely to

<sup>11</sup>Note that the number of consumption items is not reported at a per-capita level as it does not increase proportionally with household size.

**Fig. 2** Consumption distribution by population and treatment (*Source* Authors' calculations using HFS 2017, IDPCSS 2017 and CRS 2017)

**Fig. 3** Number of items consumed by population and treatment (*Source* Authors' calculations using HFS 2017, IDPCSS 2017 and CRS 2017)

be the case that a small number of respondents are significantly underreporting, rather than a large number of people underreporting by just a little bit). Given the treatment is not designed to increase reported consumption levels per se, but rather to invoke honesty, it should affect only those people who are misreporting. Hence, heterogeneous treatment effects across different household consumption levels (quantiles) test the validity of 'honesty primes.'12 Figure 4 depicts priming effects across different consumption levels for the four outcome measures of interest.13 The priming significantly increases reported consumption among lower consumption levels, but not for medium and higher consumption levels. Significant treatment effects mainly influence the reported number of consumption items and the quantities in kilograms. Monetary and caloric consumption measures are not as strongly affected. The latter might also be less susceptible to deliberate misreporting as they depend in part on variables over which the respondent has no control (calories per item; deflators).
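To make the idea of effects that vary across the distribution concrete, the sketch below compares the cut points of a reported outcome between a treated and a control group. It is only an illustration under stated assumptions: the function name `quantile_differences` and the toy data are hypothetical, and the comparison of unconditional quantiles is a simplification that stands in for, and is not, the regression framework described in the chapter's appendix.

```python
from statistics import quantiles


def quantile_differences(treated, control, n=10):
    """Return treated-minus-control differences at each of the n-1 cut points.

    Assumption: under random assignment, a gap between the two groups at a
    given quantile can be read as a rough indication of a treatment effect
    at that part of the distribution.
    """
    q_treated = quantiles(treated, n=n, method="inclusive")
    q_control = quantiles(control, n=n, method="inclusive")
    return [t - c for t, c in zip(q_treated, q_control)]


if __name__ == "__main__":
    # Hypothetical item counts: priming lifts only the lowest reports.
    control = list(range(1, 21))
    treated = [x + 2 if x <= 5 else x for x in control]
    for i, diff in enumerate(quantile_differences(treated, control, n=4), start=1):
        print(f"quartile cut {i}: difference = {diff}")
```

In this toy example only the lowest cut point moves, which mirrors the pattern the text describes: a prime that invokes honesty should shift low reports while leaving the middle and top of the distribution unchanged.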

The priming has stronger effects among the more vulnerable IDPs. The non-IDP subsample is used to assess the robustness of our main results as we would expect a less significant priming effect among the non-IDPs. Results in Fig. 5 indicate less significant effects, corresponding to the hypothesis that 'honesty primes' are more effective among more vulnerable IDPs.14 This corresponds to adverse/perverse incentives in foreign assistance settings. Specifically, when IDPs are exposed more intensively to development aid, they may be more likely to signal their 'neediness' or provide socially desirable answers to signal their 'worthiness' for assistance.

Four dichotomous indicators are used to assess whether the priming shifts a significant share of respondents above certain reporting

<sup>12</sup>One might be concerned that honesty primes affect the consumption level of households and, thus, shift the household to another comparison group. Due to the theoretical expectation that treatment effects occur at lower levels of household consumption and are 'light-touch', treatment and control group should still be comparable.

<sup>13</sup>Figure 4 provides a band of the statistical 95% confidence interval of the estimate. Thus, if the confidence band does not cross zero, there would be a 5% chance of indicating significant effects, while the 'true' effect would be zero.

<sup>14</sup>For example, Cilliers et al. (2015) or Bräutigam and Knack (2004).

**Fig. 4** Treatment effects across quintiles (IDPs) (*Source* Authors' calculations using HFS 2017, IDPCSS 2017 and CRS 2017. All regressions use clustered robust standard errors [White 1980]. Confidence bands refer to the 95% confidence interval. Consumption quantities, values, and calories are used in per-adult equivalent terms. The regression framework is introduced in the appendix. No sampling weights are used as 'honesty primes' are expected to affect, specifically, the extremes of the distribution and the average treatment effect is not a priori of interest)

thresholds. The indicators are equal to one if (i) the respondent household surpasses the caloric subsistence level of 1200 kcal or (ii) the recommended level of caloric intake of 2100 kcal. Two further dummies are created at (iii) 66.66% and (iv) 100% of a normalized poverty line, which is scaled by the fact that only core consumption items were assessed consistently across all surveys. Although the coefficients are mostly positive, only two coefficients turn significant in columns (2) and (3) (Table 1). The results stress the positive effect of the primes, where seven percent more respondent households would have reported above the recommended daily calorie intake level. However, only certain population strata are affected.
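The four dichotomous indicators can be illustrated with a short sketch. The caloric thresholds (1200 and 2100 kcal) and the 66.66% and 100% cut-offs come from the text; the function names, the dictionary keys, and the sample values are hypothetical, and the poverty line is treated here simply as a number supplied by the analyst rather than the normalized line constructed in the study.

```python
SUBSISTENCE_KCAL = 1200   # caloric subsistence level cited in the text
RECOMMENDED_KCAL = 2100   # recommended caloric intake cited in the text


def threshold_indicators(kcal_per_day, consumption_value, poverty_line):
    """Return the four 0/1 indicators for one household.

    Assumption: 'surpasses' is read as a strict inequality, and
    consumption_value is expressed in the same units as poverty_line.
    """
    return {
        "above_subsistence": int(kcal_per_day > SUBSISTENCE_KCAL),
        "above_recommended": int(kcal_per_day > RECOMMENDED_KCAL),
        "above_two_thirds_line": int(consumption_value > 0.6666 * poverty_line),
        "above_full_line": int(consumption_value > poverty_line),
    }


def share_above(indicator_rows, key):
    """Share of households for which the given indicator equals one."""
    return sum(row[key] for row in indicator_rows) / len(indicator_rows)


if __name__ == "__main__":
    # Hypothetical households: (kcal per day, consumption value, poverty line)
    rows = [
        threshold_indicators(1500, 70.0, 100.0),
        threshold_indicators(2500, 120.0, 100.0),
    ]
    print(share_above(rows, "above_recommended"))
```

Comparing `share_above` between treated and control households for each key would give the kind of difference in shares that the threshold analysis is concerned with, although the chapter itself estimates these differences in a regression framework.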

**Fig. 5** Treatment effects across quintiles (non-IDPs) (*Source* Authors' calculations using HFS 2017, IDPCSS 2017 and CRS 2017. All regressions use clustered robust standard errors [White 1980]. Confidence bands refer to the 95% confidence interval. Consumption quantities, values, and calories are used in per-adult equivalent terms. The regression framework is introduced in the appendix. No sampling weights are used as 'honesty primes' are expected to affect, specifically, the extremes of the distribution and the average treatment effect is not a priori of interest)


**Table 1** Results using poverty thresholds

*Source* Authors' calculations using HFS 2017, IDPCSS 2017 and CRS 2017 Robust standard errors in parentheses: \**p*<0.1, \*\**p*<0.05, \*\*\**p*<0.01

## **4 Lessons Learned and Next Steps**

Most measures to increase the accuracy of surveys assume that respondents want to report as accurately as possible. In many cases, this assumption is incorrect. This research offers novel and suggestive evidence that increasing the salience of honesty may increase survey accuracy, even if incentives to misreport exist. We find significant treatment effects for respondents most likely to be underreporting (those at lower levels), but no significant effects for those at higher levels who are unlikely to be underreporting. We find that the effects are stronger for outcome measures that can easily be manipulated (the number of consumption items) than for those that cannot easily be manipulated (the monetary consumption quantities).

The study underlying this chapter has two main limitations. First, while the experimental set-up allows for identifying a clean treatment effect, it can only compare the control group against an estimate of the 'true' rates of consumption. Without more objective data it is not possible to dismiss the possibility that the higher consumption levels reported in the treatment group are not true and subject to overreporting. The mortality rates among IDPs suggest that starvation is not occurring systematically across the country, but the precarious situation calls for further scrutiny.15 Before adjusting poverty estimates, a thorough comparison with more 'objective' data from administrative, anthropometric, or observational sources is needed. Second, the intervention is bundled. For this reason, it is impossible to isolate the causal mechanism affecting the observed changes in reporting. However, if classical measurement error were affected, treatment effects of the primes should be uniform. In contrast, heterogeneous effects across quantiles suggest that the targeting of intentional misreporting via the appeal to honesty and moral prime would be the driver of our results. More research, which unbundles these primes in different treatment arms or combines them with other survey tools, can contribute to developing more durable solutions for data collection. Due to both the low

<sup>15</sup>FEWS NET (2018).

costs in terms of money and survey time, the 'honesty primes' constitute a valuable supplement for surveys in contexts where incentives for underreporting exist. Beyond fragile states, the primes could also be a possible survey extension if aid reliance is high (e.g., in Mali or Malawi) as indicated by our subsample analysis.

# **References**


The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Part III**

**Other Innovations**

# **12**

# **Using Video Testimonials to Give a Voice to the Poor**

**Utz Pape**

# **1 The Data Demand and Challenge**

South Sudan is a country with a very tumultuous recent history, witnessing more than its share of crises since 2013. The collapse of a fragile peace accord in 2016 led to a renewed military confrontation, while international oil prices simultaneously dropped, depriving South Sudan of its main source of foreign exchange. This triggered a severe fiscal and economic crisis, causing prices to skyrocket, and making many market products unaffordable for the majority of South Sudanese. Thus, securing livelihoods has become more and more difficult, with 66% of the population, a record high, living in poverty. While this number summarizes the country's poverty level, which is important for comparability and analyses to inform policies and programs, the number does not reveal the daily struggles that families face.

U. Pape (\*)

World Bank, Washington, DC, USA e-mail: upape@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_12

The collection of household data is usually a passive process where respondents are asked pre-formulated questions. This constrains the respondents in sharing their own narratives and emphasizing what they feel is important. Giving a voice to the poor beyond being an anonymous and abstract data point is not only helpful to better understand the concerns of the poor, but also to empower them to create a narrative that they own. While some social programs include activities to empower the poor by giving them a voice, the implementation of household surveys is an opportunity that is often missed, in terms of using direct contact with the population across a country to transform a one-sided narrative into one that empowers the poor.

# **2 The Innovation**

To empower the poor and bring humanity to an abstract poverty-related number, we decided to collect short, voluntary video testimonials from people living in South Sudan as part of the High Frequency South Sudan Survey. The High Frequency Survey conducts household interviews in urban and rural areas in South Sudan. The survey is used to collect consumption data in order to estimate poverty, and to measure other socio-economic indicators. As the data is collected using tablets, we decided to utilize the full capability of the tablets by recording voluntary videos after the structured interview if the respondent consented. The video testimonials were subsequently edited, English subtitles were added to translate the local languages, and noise filters were used to enhance audio quality. The video testimonials were then categorized into themes such as poverty and livelihoods or security and displacement, and were published on the dedicated website www.thepulseofsouthsudan.com.

# **3 Key Results**

The testimonials captured the dire situation in South Sudan, revealing what it is like to live in poverty. They were shown as part of workshops and conferences and made available on a website. While abstract data may help the government fine-tune its policies, the videos depict the sense of powerlessness, the pain of hunger, and the feelings of hopelessness and disappointment that characterize people's experiences. The testimonials capture the struggle of parents watching their children starve, not being able to provide for them or send them to school, and knowing that tomorrow will not be a better day.

The opportunity for the poor to voice their struggles is a first step toward empowerment, allowing them to share their lives with the world. The testimonials can also serve to inspire policymakers to continue finding innovative ways to help the respondents and millions of others like them to escape poverty. While there is no substitute for quantitative analysis in designing programs and policies, such video testimonials are an effective tool to raise awareness about the concerns of the poorest. They make it clear that poverty is not just a number but a human struggle.

# **4 Implementation Challenges, Lessons Learned, and Next Steps**

We started collecting video testimonials in a pilot, without providing specific training or additional equipment to the enumerators. When we watched these testimonials, we quickly realized that some training was essential. While the videos often started by recording the faces of the respondents, the camera usually moved downwards after a few seconds and ended up recording only their feet or the dust on the ground. Loud wind or other noises sometimes drowned out the voices of the respondents.

To improve the quality of the recordings, we collaborated with journalists and documentary producers to design a one-day training for the enumerators. The training was used to introduce two pieces of very inexpensive but essential equipment: a tripod was necessary to ensure that the camera remained steady and focused on the respondent; and a microphone that could be clipped to the shirt of the respondent ensured that the voice would be audible. The training also included professional guidance on asking open-ended questions to initiate the video testimonial. The success of the training was evidenced in the remarkable quality of video testimonials that were recorded after the training.

During the fieldwork period, there was a decline in the number and the quality of video testimonials. Naturally, enumerators were exposed to various pressures, and were required to conduct as many interviews of sufficient quality as possible. To create more space for video testimonials and to focus on the quality of videos, we introduced monetary incentives for enumerators recording the most and the best video testimonials. The enumerators welcomed this competition with each other, and we saw an increase in the number and the quality of testimonials.

The World Bank's inaugural flagship report, Poverty and Shared Prosperity 2016: Taking on Inequality, raised concerns about addressing prevalent data gaps in measuring poverty. The World Bank has therefore pledged to ensure that the 78 poorest nations have household-level surveys every three years. To date, 41 of 48 Sub-Saharan African countries have surveys ongoing or planned over the next two years. These surveys also represent an opportunity to give more voice to the poor. Our experience in South Sudan shows that recording testimonials is an extremely low-cost intervention when implemented in conjunction with a household survey. In fact, the additional costs in South Sudan were below US\$50k, a small percentage of the overall survey costs. Giving a voice to the poor brings us one step closer to achieving our goals of ending extreme poverty and boosting shared prosperity by 2030.


# **13**

# **Iterative Beneficiary Monitoring of Donor Projects**

**Johannes Hoogeveen and Andre-Marie Taptué**

# **1 Introduction**

Mali is a sparsely populated, predominantly desert country with an undiversified economy. It is particularly vulnerable to commodity price fluctuations (gold is a major export), and to the consequences of climate change. Mali has a population of 15 million, 10% of whom are living in the three northern regions of Gao, Kidal, and Timbuktu. High population growth rates, low agricultural productivity, and weather shocks fuel food insecurity, poverty, and instability. The delivery of services within this large territory is challenging and affects geographic equity and social cohesion.

Mali's political and security situation became volatile in 2012 when the northern regions were occupied by rebel and criminal groups who

J. Hoogeveen (\*) · A.-M. Taptué

World Bank, Washington, DC, USA

e-mail: jhoogeveen@worldbank.org

A.-M. Taptué e-mail: ataptue@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_13

threatened to take over the country in a coup. These events led to a coup and the deployment of French-led military forces in January 2013. In July 2013, the United Nations Multidimensional Integrated Stabilization Mission in Mali (MINUSMA) took over security measures from the French forces. Constitutional order was restored when two-round presidential elections were held in July and August 2013, with a turnout of 49 and 46% of eligible voters, respectively.

A Peace Accord between the government and two rebel coalitions, known as the "Platform" and "Coordination" groups, was signed by the government and the Platform group on 15 May 2015, and by the government and the Coordination group on 20 June 2015. However, its implementation remains challenging. Security, which is critical to ensuring economic recovery and poverty reduction, remains fragile, with continuing attacks on the UN forces and the Malian army by jihadist groups in the north. There are also attacks on civilians in Bamako, the most recent of which targeted the Radisson Blu Hotel in November 2015, the Nord-Sud Azalai Hotel in March 2016, and a holiday resort near Bamako in June 2017.

Following the presidential elections, a Mali donor conference was organized in Belgium. At the conference, the international community confirmed its continued support, and aid flows, which had declined following the coup, resumed. Following the conference, development partners including the World Bank started to prepare new projects, many focusing on the still insecure northern part of the country. With this refreshed engagement came an increased commitment to project performance.

Information on project implementation is typically captured by project monitoring systems. These monitoring systems track progress but are also expected to flag potential shortcomings or problems. In practice, most monitoring systems do not act as independent rapporteurs, but focus on producing progress indicators for midterm and final reviews. Even this reduced role is not always well-executed and reports often come too late to help projects improve. Supervision missions offer another source of information on project performance, but there is a limit to the information such missions obtain. After all, why show a team of visiting project supervisors an activity that is facing problems?

Less biased information about the effectiveness of projects comes from evaluations by non-project staff. Typically, these take the form of randomized control trials, or large-scale surveys, such as the Service Delivery Indicator (SDI) Surveys, which measure the quality of service delivery in health and education, or Public Expenditure Tracking Surveys (PETS). The challenge of these data-intensive approaches is not their reliability, but that they are expensive and therefore not able to be repeated frequently. Moreover, they are time-consuming and rarely deliver quick results; sometimes, results only become available after the project has finished.

## **2 The Innovation**

For project managers who want to use monitoring data, information obtained through iterative feedback loops is to be preferred over data from infrequent surveys. After all, if the aim is to improve outcomes, it is important not only to establish what a project's problems are, but also to act to address them and to assess whether the action resolved the issue. The idea behind an iterative feedback loop is to allow a project team to learn lessons from a project's shortcomings and improve its performance. Once action has been taken, one must assess whether the identified deficiencies have been resolved. To allow for regular feedback, data collection should be affordable and focused. Reliable, regular, and inexpensive data are the ideal (see also Box 1).

To meet these requirements, a beneficiary feedback system was designed that is light and low-cost, focused on a select set of issues, and implemented by an independent entity with no stake in the outcomes of the project. This approach has been labeled Iterative Beneficiary Monitoring, or IBM. By keeping data collection focused (few research questions and small samples), IBM facilitates timely data analysis and the rapid preparation of reports. By keeping data collection costs down, frequent data collection becomes feasible. The IBM approach reflects a major difference from more typical monitoring systems that collect the bulk of their information at the beginning, in the middle, and at the end of the project. The approach fits within the thinking on adaptive project design as well as complexity, approaches to project design and implementation that stress the importance of context, collecting feedback and demonstrating flexibility in design and implementation.1

#### **Box 1 Beneficiary monitoring is not a new concept, but light monitoring is**

IBM is not the first time projects systematically seek feedback from beneficiaries during project implementation. A 2002 social development paper presented lessons learned from Beneficiary Assessments that aimed to amplify the voice of the people for whom development is intended. In the report, Beneficiary Assessment is presented as a tool for managers who wish to improve the quality of development operations. The approach, which is rarely used today, has been applied to over 300 projects in 60 countries; it is qualitative, and relies on a combination of direct observation, conversational interviews, and participant observation.

This qualitative approach differs from IBM in important ways. IBM samples tend to be much smaller, its reports shorter, more factual, and produced within weeks of data collection. The cost of the qualitative approach is also much higher. Whereas IBM never costs more than \$5000 per round of data collection, the average cost of qualitative Beneficiary Assessments was \$40,000 per round of data collection. For these reasons the qualitative approach is less suited to serve as an iterative feedback loop that is repeated regularly.

*Source* L. F. Salmen (2002).

How does IBM work in practice? An iterative feedback loop begins with gaining intimate knowledge of a project. This implies discussions with the project manager and those responsible for project implementation (such as the Project Implementation Unit) to establish trust and to identify issues in need of investigation.2 Project staff are in an excellent position to reflect on the factors that may be hampering successful project implementation.

<sup>1</sup>Andrews et al. (2012) and Bowman et al. (2015).

<sup>2</sup>Agreeing to an iterative feedback system at the project design stage is another way to facilitate collaboration between project monitors and project implementers. Nobody questions the need for financial audits, and the same should hold for iterative monitoring. It is difficult to oppose the development of such a system at the design stage, when everyone is working to design a project that delivers the best possible results.

Core project documents need to be read, starting with the Project Appraisal Document (i.e. the document describing the project, its objectives, and modes of implementation). The Implementation Manual is another important document because it describes how the project is expected to operate in practice. It can also be invaluable for identifying sources of information or standards that can be used to assess the project. Supervision reports, aide memoires, and mission reports help to identify issues of potential concern. Project familiarization is time-consuming and, in itself, an iterative process. It is indispensable if an effective approach to data collection is to be designed, and because it builds trust with the project staff, laying the groundwork for follow-up once results have been produced.

Collecting information from beneficiaries and others at the front-line of service provision (such as staff working in schools, clinics, or farmers' organizations) is at the heart of the iterative feedback approach. Their experience with the project is what ultimately matters. IBM thus focuses on obtaining direct feedback from these beneficiaries. Identifying what information to obtain from whom is an important step in the design of a feedback system. For instance, in a project offering meals to students, the perspective of parents and guardians is critical because they can ascertain that children have eaten. Students can give their views on the quantity and quality of the food and how often they receive it. Head-teachers can confirm whether the money to buy the food arrives on time, Parent Teacher Associations can explain whether procedures are being followed, and those who prepare the food are well placed to report whether the money they receive is sufficient.

It is thus critical that the iterative system is developed in close collaboration with project managers. They need to provide access to project files (including beneficiary databases needed for sampling) and to validate the methodology and instruments for data collection. If this is not carefully done, project managers may eventually contest the validity of the results, and little follow-up can be expected. While the monitoring team will need to collaborate closely with project management, the team will also need to ensure that the identity of respondents and the locations where data are collected are kept confidential. If this is not done, there is a risk that the results will be biased.

It is important to keep the data collection exercise light, and to resist the temptation to collect more information than is strictly necessary. A project manager's capacity is often constrained, and a project team can only handle so many issues at a time. Given that the approach is iterative, new issues can be addressed in subsequent rounds of data collection, and not all issues need to be investigated in the first iteration. This gives the project team the option to prioritize what is most critical or most easily addressed. Because the data collection exercise is kept light, the design of data collection instruments is relatively straightforward. Nonetheless, validation of the data collection instruments by project management remains an essential step. This includes pre-testing in a real-life setting and discussing the instruments with key project staff to ensure that the right issues are captured in an appropriate way (Fig. 1).

The design phase of the iterative approach is typically the most time-consuming phase and hence the most resource-intensive. Rapport must be built with project staff, and analysts need to familiarize themselves with the details of the project and develop, discuss, and test data-collection instruments and approaches. In comparison, data collection itself is relatively inexpensive. The "golden rule" of IBM is that each round of data collection should cost less than \$5000. This is an arbitrary number which is kept deliberately small to force IBM designers to focus on key issues and affordable samples. Given this cost structure, the iterative feedback loop differs fundamentally from typical survey exercises, where data collection is the costliest part of the process. Keeping data collection costs low is of paramount importance to the success of IBM, because otherwise frequent data collection would not be affordable and the approach would lose its iterative character.

**Fig. 1** Five steps of the IBM approach

Data are typically collected by enumerators specifically hired and trained for the task. Data can be collected using face-to-face interviews, but due to the high transportation costs of survey data collection, samples need to be kept to a minimum. This need not be a problem. When project-related issues are widespread, or when standards or deadlines must be met (as set out in the Implementation or Operations Manual), a small number of deviations may pinpoint a problem. Irrespective of sample size, attention needs to be paid to the sample design to ensure that the results are representative; this implies identifying a database from which the sample can be drawn. This is usually not a problem, as most projects maintain a database of beneficiaries. Additional decisions may also have to be made to keep costs down, but these should always be discussed with project staff to ensure that such decisions are acceptable. For instance, it may be proposed to sample only from one small geographic area. This might be acceptable if this area reflects an upper bound on performance, meaning that the effects of any of the project's shortcomings are likely to be worse in other areas. For example, if it takes a long time to transfer money to schools close to the capital, then it is plausible to assume that the situation is worse in more remote areas.

Figure 2 illustrates a case in Tanzania, generated as a precursor to IBM by one of the authors. It shows how a small number of water kiosks (24 observations), drawn randomly from a database of all water kiosks, already shows that official tariffs set by the regulator are ignored.
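The arithmetic behind this observation can be sketched briefly. As a rough illustration (not part of the original analysis), suppose a share p of all units deviates from a standard and units are sampled independently; the chance that a sample of n units contains at least one deviation is then 1 − (1 − p)^n. The function names and the 20% share used below are hypothetical, and the formula treats the sampling frame as large relative to the sample.

```python
def prob_at_least_one(share_affected: float, sample_size: int) -> float:
    """Probability that a random sample contains at least one affected unit.

    Assumes independent draws (a reasonable approximation when the
    sampling frame is much larger than the sample).
    """
    if not 0.0 <= share_affected <= 1.0:
        raise ValueError("share_affected must lie between 0 and 1")
    if sample_size < 0:
        raise ValueError("sample_size must be non-negative")
    return 1.0 - (1.0 - share_affected) ** sample_size


def min_sample_size(share_affected: float, target_prob: float) -> int:
    """Smallest sample size that reaches the target detection probability."""
    if not 0.0 < share_affected <= 1.0:
        raise ValueError("share_affected must be in (0, 1]")
    if not 0.0 <= target_prob < 1.0:
        raise ValueError("target_prob must be in [0, 1)")
    n = 0
    while prob_at_least_one(share_affected, n) < target_prob:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical scenario: one in five kiosks ignores the official tariff.
    print(prob_at_least_one(0.2, 24))
    print(min_sample_size(0.2, 0.95))
```

Under these assumptions, a sample of 24 units would include at least one deviation with a probability above 99% if one in five units deviated, which is why a handful of observations can be enough to show that a widespread problem exists. Detecting a problem is not the same as measuring its extent precisely; the latter requires larger samples.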

Technology can be used to enhance efficiency and reduce cost. If projects collect phone numbers of beneficiaries, information can be collected rapidly and in a cost-effective manner by enumerators who call beneficiaries on their mobile phones (see Chapters 2 and 3 on data collection using mobile phone interviews). This allows for larger samples while remaining within the \$5000 data collection budget and is particularly important in a context of insecurity, or when the population may be hostile to authorities and their activities. Mobile phone-based data collection is also a solution when beneficiaries are mobile, as is the case for displaced populations or nomads (Chapter 4). Because collecting data over the phone is inexpensive, collecting phone numbers of beneficiaries simplifies the creation of an iterative feedback loop.

*Prices charged at various water kiosks in Dar es Salaam*

**Fig. 2** Small samples may suffice to uncover problems (*Source* Uwazi 2010)

### **Box 2 How IBM compares to project monitoring**

Iterative beneficiary monitoring is an agile, inexpensive way to obtain feedback on project implementation. IBM can be considered a complement to project monitoring in the following ways:

First, while traditional project monitoring is used to continuously assess overall implementation progress and tends to produce voluminous progress reports at fixed points in time, IBM is demand-driven, produces short reports, can be repeated as often as is needed, and is focused on diagnosing specific barriers to effective implementation.

Second, project monitoring provides progress reports to the project manager, while IBM reports to the person responsible for the project in the donor organization. IBM thus functions as an independent check on project monitoring systems, much in the same way that financial audits serve as an independent check on companies' regular financial reports. Within the World Bank, IBM is carried out by non-project staff, who do not bear responsibility for supervising the project. Though IBM has never been applied in this manner, it could be viewed as a means to assess the ability of a management information system (MIS) to identify pertinent issues. By engaging non-project staff, project teams tend to benefit from a fresh perspective that helps teams improve, even in well-established projects.

Third, relative to a field supervision mission by the project lead, IBM is project supervision "on steroids", as IBM obtains feedback from a much larger sample of beneficiaries than could possibly be covered by a supervision mission visiting two or three project sites. When IBM goes to project sites, it typically visits some 20–30 sites. When beneficiaries are interviewed by phone, sample sizes lie between a couple of hundred and one thousand. IBM also collects data from randomly selected activities, hence avoiding selection bias.

Once collected, data are analyzed and offered as feedback to project managers and project leaders. Given that the dataset is kept small, analysis is rapid. IBM reports are specific, factual, and short, typically less than ten pages. As reports are likely to reveal a project's shortcomings, care is taken to ensure the highest standards of accuracy. Where World Bank projects are concerned, management is copied as a matter of procedure. Often, results will also be discussed with those responsible for the project in the client government. These authorities may request that the project team take the steps required to address the issues, but this is rarely needed, as project teams tend to be responsive to IBM findings. Another round of data collection will follow sometime later (generally after a few months), with the aim of measuring improvements and assessing whether new issues have arisen. The reporting process is the same as for the earlier round. This cycle is repeated on a regular basis until the end of the project.

Reports remain internal, intended for use by the client government, project managers, and supervisors. Disclosing negative facts publicly could have unintended negative consequences and is not an objective of IBM.<sup>3</sup> The experience with water price monitoring (as shown in Fig. 2) is illustrative in this regard. Light monitoring principles were applied, but instead of working to address the issue with the regulator, those in charge of the monitoring process sought media attention. Public pressure and parliamentary questions led to corrective action, but this was ad hoc and symbolic in nature. Certain responses even aggravated the situation, as some water kiosks were closed because they had been overcharging, leaving those dependent on water kiosks with fewer options than they had previously. After the initial media interest, there was no systematic follow-up, and overcharging continued unabated.

<sup>3</sup>See also J. Hoogeveen and N. Nguyen (2017).

# **3 Key Results**

The IBM approach was first introduced in Mali, offering feedback to an education project (school feeding), an agriculture project (electronic subsidies or e-vouchers), a social protection project (cash transfer), and also to activities managed by the Malian authorities, such as the provision of health insurance to the extreme poor and the functionality of newly established land commissions. In the case of school feeding, the project supervisor expressed concern that only part of the money allocated to this activity was being used. To explore this issue, a clear division of tasks was agreed: the team member from the Poverty Practice of the World Bank would take charge of all issues related to data collection and reporting, while the supervisor from the Education Practice of the Bank would facilitate all interactions with the Ministry of Education and the Project Implementation Unit. The collaboration was smooth, and after some introductory and follow-up meetings, the National Centre of School Canteens at the Ministry of National Education shared the database of schools benefiting from the school feeding program. This database was used to draw a sample of beneficiary schools. To ensure ownership and accuracy, officials from the Ministry and the Centre actively participated in the preparation and validation of the survey methodology and tools but were not provided the list of schools included in the sample.

The first round collected data in 20 randomly selected schools. Two enumerators were trained and traveled to each of the schools to carry out face-to-face interviews with head teachers, managers of school canteens, and a subsample of parents. It cost less than US\$5000 to complete the data collection exercise, and the report took little time to prepare, as information had only been collected on a limited set of issues. Officials from the National Centre of School Canteens were informed about the main results together with the project manager. Results were shared with the Country Director and the Minister of National Education.

**Fig. 3** Regular follow-up improved school feeding performance (*Source* Authors' calculations based on IBM data)

Results showed that it took more than four months to transfer money from the Ministry of National Education to schools. Consequently, much of the money for school feeding arrived after the school year had started, jeopardizing one of the objectives of the program, namely increasing enrolment rates. Moreover, the amount of money sent to schools was insufficient to feed all students during the envisaged period, and some schools were forced to offer food less than five days per week, reducing the incentive for students to remain in school (Fig. 3).

Transfers were expected every quarter, but their real frequency was lower. Also, procedures as described in the operations manual were not followed exactly. Amounts transferred were supposed to reflect enrolment rates, for instance, but they often deviated and were much higher or lower than they should have been.

The IBM report was discussed with the project staff and the Minister of National Education, who followed up by sending letters to project officials demanding improvements. Additional supervision missions were initiated, and school enrolment information was updated to ensure the correct amounts were transferred.

Six months later, a second round of data was collected, this time in 30 schools randomly selected from a list that excluded the schools that had been surveyed in the first round. Results showed it now took much less time for money to arrive at the schools. Most schools received close to the exact amount that was expected, and all money that was disbursed by the Ministry arrived in the schools. The second report thus showed significant improvements in project implementation, though certain issues persisted (Table 1).
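The sampling step described here, a random draw from the beneficiary database that leaves out units already surveyed, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually used in Mali: the function name, the school identifiers, and the size of the frame (200 schools) are hypothetical.

```python
import random
from typing import Iterable, List, Optional


def draw_round_sample(
    frame: Iterable[str],
    sample_size: int,
    exclude: Iterable[str] = (),
    seed: Optional[int] = None,
) -> List[str]:
    """Draw a simple random sample from a sampling frame.

    Units listed in `exclude` (for example, schools surveyed in an
    earlier round) are removed from the frame before sampling.
    """
    excluded = set(exclude)
    # Deduplicate and sort so that a given seed always yields the same draw.
    eligible = sorted(unit for unit in set(frame) if unit not in excluded)
    if sample_size > len(eligible):
        raise ValueError("sample_size exceeds the number of eligible units")
    return random.Random(seed).sample(eligible, sample_size)


if __name__ == "__main__":
    # Hypothetical frame of 200 beneficiary schools.
    schools = [f"school_{i:03d}" for i in range(200)]
    round_one = draw_round_sample(schools, 20, seed=1)
    round_two = draw_round_sample(schools, 30, exclude=round_one, seed=2)
    print(len(round_one), len(round_two))
```

Fixing the seed makes the draw reproducible for the monitoring team, while the resulting list of sampled schools can still be withheld from project officials, in line with the confidentiality described earlier.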

The success of this data collection approach in the education sector aroused interest from other project supervisors. The approach was then introduced to an agriculture project that distributed subsidies in the insecure north of the country using electronic vouchers (e-vouchers). E-voucher beneficiaries had been registered and their phone numbers and core characteristics captured in a database. This information was used to send them vouchers by text message. Upon receipt of their vouchers, beneficiaries could buy specific products, typically fertilizers and livestock products, at designated retail locations at a discount.

Project management expressed concern about the limited uptake of the subsidies. A supervision mission had reported that during the first wave only a fraction of the beneficiaries who had been sent an e-voucher had collected their products, even when they were free of charge. The suggestion was that there might be problems with the distribution system, or that there was a lack of interest among the beneficiaries in the products on offer. Identifying the exact nature of the problems was clearly important for the success of the project.

**Table 1** Iterative feedback approach for a school feeding project in Mali

Because the project had a database with phone numbers of its beneficiaries, and as the areas of intervention were insecure, the team opted to use telephone interviews for data collection. Project management shared its database and participated in working sessions to validate the methodology and survey instruments and to select a representative sample of 100 beneficiaries, who were interviewed by phone. Inspection of the shared database revealed the presence of many duplicate phone numbers, allocated to different people in different villages. While the procedural manual permits different beneficiaries to use the same phone number, as not everyone owns a phone, they would be expected to live in the same village. However, the duplicates identified in the database were not in the same location. After four attempts to call each respondent, only 40% had been reached, raising questions about network coverage in villages where beneficiaries live, the accuracy of the phone numbers in the database, and/or the location of beneficiaries, as some people might have left their initial locations due to insecurity.
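The database inspection described above amounts to a simple consistency check: a shared phone number is acceptable within one village but suspicious when it appears in several. The sketch below shows one way to run such a check. The record layout (the `phone` and `village` fields) and the sample records are hypothetical and do not reflect the actual project database.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def numbers_shared_across_villages(
    records: Iterable[Mapping[str, str]],
) -> Dict[str, List[str]]:
    """Return phone numbers registered in more than one village.

    Numbers shared by several beneficiaries within the same village are
    not flagged, since the procedural manual allows this.
    """
    villages_by_phone = defaultdict(set)
    for record in records:
        villages_by_phone[record["phone"]].add(record["village"])
    return {
        phone: sorted(villages)
        for phone, villages in villages_by_phone.items()
        if len(villages) > 1
    }


if __name__ == "__main__":
    # Hypothetical beneficiary records.
    beneficiaries = [
        {"name": "A", "phone": "70000001", "village": "Village 1"},
        {"name": "B", "phone": "70000001", "village": "Village 2"},
        {"name": "C", "phone": "70000002", "village": "Village 3"},
        {"name": "D", "phone": "70000002", "village": "Village 3"},
    ]
    print(numbers_shared_across_villages(beneficiaries))
```

Running a check of this kind before drawing the sample gives the monitoring team an early, low-cost finding to discuss with project management, and it helps explain low contact rates in the phone interviews.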

The initial results showed that all the beneficiaries who had received e-vouchers had collected their products, suggesting that the low uptake of the products was not for a lack of interest. As a significant proportion of beneficiaries could not be reached by phone, it was not possible to know whether all the e-vouchers had been successfully delivered. It seemed plausible that, like the failed telephone interviews, many e-vouchers had failed to reach their intended beneficiaries, suggesting a communication problem between the e-voucher platform and the beneficiaries. Finally, many beneficiaries indicated not having received the full quantity of (free) products indicated on their vouchers. Nor had they been compensated for any items not received.

Following these results, the Bank's team contacted the project and telecom providers to discuss the findings and to address certain issues, including the number of duplicate phone numbers in the database, the inability to send a high number of text messages per second, and the absence of a "text message received" message.

A second round of data collection was carried out five months later. The sample was increased because there was a need to assess not only whether the approach was working but also how well, given that the successful implementation of the e-voucher scheme was a condition for a budget support operation to the government of Mali. Evidence therefore had to be collected on the percentage of beneficiaries in different districts and on the application of targeting criteria. The second round showed that the management of the system had improved. The database was cleaner, more respondents could be reached, more messages could be sent per second, and receipt messages were now received. However, the results also showed that the roll-out of the scheme still left much to be desired. Not all the agreed zones were covered, and e-vouchers had been sent late, three months after the start of the agricultural season. Moreover, e-vouchers were distributed for fertilizers that could not be used given the stage of the growing season. Finally, fertilizer suppliers turned out to have been selected using a non-competitive method. These findings led to high-level discussions between World Bank management and the Malian authorities (Table 2).

**Table 2** Iterative feedback for a project distributing agricultural inputs using electronic vouchers

**Fig. 4** Selected gender outcomes uncovered by different IBM activities (*Source* Hoogeveen et al. 2018)

IBM, because it collects evidence directly from beneficiaries, has proved to be effective at monitoring gender outcomes of projects. In a number of instances, pertinent and concerning gender biases were uncovered. Beneficiaries of a cash transfer program turned out to be mostly men, as were the beneficiaries of the e-voucher program. Land commissions lacked almost any female members (Fig. 4). The adverse gender results uncovered by IBM were not the consequence of bad intentions. Projects were often designed with gender in mind and, in some instances, even employed gender specialists. Invariably, project staff responded positively to the findings when they received them, and corrective actions followed. In the latest iteration of IBM, approaches to asking sensitive questions (discussed in Chapter 11) are used to assess with project beneficiaries whether gender-based violence might be an issue. This is at times a concern, particularly for infrastructure projects in fragile or remote settings.

# **4 Implementation Challenges, Lessons Learned, and Next Steps**

IBM's iterative feedback approach is relatively straightforward, but applying it successfully requires care. Building a good rapport with a project team is critical: nobody likes to receive negative feedback, although this is precisely what an iterative feedback system often delivers. Confidentiality, good relations with project staff and the government, and agreement on the shared objectives of the monitoring process are essential. Once it is evident that the objectives of the IBM team are aligned with those of the people responsible for project implementation, reticence typically disappears.

Integrating an iterative monitoring approach at the project design stage makes it possible to identify opportunities for beneficiary monitoring early on. Small changes in project design or in the procedural manual can greatly facilitate iterative monitoring. For instance, it makes a difference when procedural manuals stipulate that phone numbers and core characteristics of beneficiaries need to be captured in an electronic database that can be accessed for sampling and (anonymized) monitoring. It also makes a major difference when a procedural manual stipulates that certain benefits need to be distributed by a certain date, as this then offers a clear point in time when progress toward project objectives can be measured.

Even if an iterative monitoring approach is only designed during the project implementation phase, ways can be found to make follow-up monitoring easier. Registering the phone numbers of respondents in face-to-face interviews allows for easy follow-up. Indeed, during each round of the school feeding IBM exercise, phone numbers of respondents (canteen managers, head teachers, and households) were collected for future follow-up. Sometimes feedback is offered spontaneously, with beneficiaries volunteering information to the project team, often by text message, about instances when the money for school feeding was exhausted before the expected date, about whether or not the money arrived on time, or about other issues affecting the functioning of the canteen. When such information is received and deemed relevant, the project team can use the phone numbers of other beneficiaries to verify whether what has been reported is a unique case or an indicator of a more generalized problem.<sup>4</sup>

<sup>4</sup>Note that the iterative approach differs from approaches in which beneficiaries are given the opportunity to register complaints. Complaints flag issues, but are not able to distinguish between idiosyncratic negative experiences and the presence of more general project failures. For the latter, feedback needs to be collected in a structured manner.

Another issue for consideration is who should conduct the monitoring. Unsatisfactory results with existing monitoring systems suggest that much is to be said for monitoring by an independent third party. In Mali, staff from the Poverty Practice were responsible for data collection, while staff from the Education and Agricultural Practices, who were responsible for project implementation, facilitated dialogue with project staff. Working with staff from the Poverty Practice had major advantages. Its micro-economists are experienced in sampling, designing instruments for data collection, training enumerators, and executing primary data collection activities, as well as in data analysis and reporting. Moreover, its staff are familiar with prevailing operating procedures but do not bear responsibility for the success or failure of a project. This facilitates giving independent, unfiltered feedback.

Local presence is another important element for success. Presence facilitates building trust with the project teams and an understanding of how the project operates, and makes it much easier to have discussions about results and corrective actions. Presence close to the location of implementation also increases responsiveness, which is important when issues need to be identifed and addressed quickly: after all, lost days cannot be made up, missed meals cannot be replaced, and agricultural inputs distributed late are of little use to farmers.

Familiarity with project procedures and staff facilitates the design of an iterative loop, and as such, outsourcing the approach in the same way as financial audits are outsourced is likely to be a challenge. An intermediate approach, however, could work. Design of instruments and reporting could be left to staff familiar with household survey design and analysis, and dialogue with the client left to those responsible for the project, while data collection could be outsourced. Such an institutional set-up underscores the respective responsibilities of the recipient government, those responsible for project implementation, for project supervision, and for offering beneficiary feedback. It assures a separation of roles which helps avoid reporting bias.

## **References**


The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **14**

# **Concluding Remarks: Data Collection in FCV Environments**

**Johannes Hoogeveen and Utz Pape**

Environments characterized by fragility, conflict, and violence (FCV) are very heterogeneous, comprising countries as different as Togo and Tuvalu, but also Syria or Chad. Despite this heterogeneity, there are major commonalities. All fragile countries are characterized by limited administrative capacities, volatile and uncertain country situations, and a high degree of data deprivation. Many, but not all, fragile countries are affected by violence. When considering how to address urgent data gaps in fragile countries, the potential for data collectors to be exposed to violence is a defining feature.

*In non-violent, fragile countries*, efforts should be made to strengthen capacities by rebuilding and strengthening existing statistical systems. As capacities are limited, care should be taken not to overload strained systems with major reform efforts or overly ambitious statistics production programs. Certain areas may have to be prioritized, such as the creation of up-to-date sampling frames, as high volatility (often observed in pre- or post-crisis countries) outdates existing sampling frames more rapidly than in a normal context. Given the cost and logistics of updating sampling frames with traditional methods, Chapters 7 and 8 offer alternative approaches that could be followed to bridge the gap until a traditional population or enterprise census can take place.

J. Hoogeveen (\*) · U. Pape

World Bank, Washington, DC, USA

e-mail: jhoogeveen@worldbank.org

U. Pape

e-mail: upape@worldbank.org

<sup>©</sup> International Bank for Reconstruction and Development/The World Bank 2020 J. Hoogeveen and U. Pape (eds.), *Data Collection in Fragile States*, https://doi.org/10.1007/978-3-030-25120-8\_14

As non-violent, fragile countries are prone to volatility, strengthening the capacity to collect data during times of crisis is recommended. The Rapid Response Surveys discussed in Chapter 3 are especially relevant and could be pursued as part of a more comprehensive crisis readiness approach. The creation of a mobile phone survey team along with the systematic collection of phone numbers of potential respondents, and the preparation of draft phone questionnaires that could be used, are small investments that would yield enormous benefits in terms of information availability during times of distress. Other measures to protect the integrity of the statistical system may also be considered, such as ensuring greater redundancy in the storage of data and reports, including by storing electronic copies off-site or in a cloud.

*For fragile countries in which violence is likely*, a business-as-usual approach is neither realistic nor desirable. The monetary as well as opportunity cost of collecting data, whether expressed in financial terms, risk, or use of scarce capacity, is much higher in violent settings, and so it is critical to consider whether the envisaged benefits of producing the data are worth the price. Higher cost invariably means less data collection, so trade-offs need to be made. Complex household surveys, suited for non-violent situations, are rarely the instrument of choice in situations of violence. At times complex surveys can be simplified, as is discussed in Chapter 9 using the rapid consumption methodology, but these approaches are technically challenging and for this reason only suited for low-capacity environments if complemented with well-trained technical assistance.

When making choices on what to collect, it is important to realize that even in violent situations, many variables remain relatively unchanged over time. Collecting information on such slow-changing aspects should be less of a priority. Other aspects change rapidly in violent FCV environments. Insecurity and deteriorated infrastructure enhance volatility as markets become thinner. As a consequence, food security is more easily put at risk. Knowing the perceptions, opinions, and grievances of citizens is critical, as they drive citizens' expectations of the authorities and their behavior, including support for local armed groups. So one should first seek answers to questions like: How do prices of key food items change? What is happening to wages and to food security? How are citizen perceptions evolving? How are displaced people cared for? Are interventions succeeding? These aspects should be monitored over time, before moving to more complex surveys.

This suggests that, relative to non-violent settings, data collection programs in violent situations should be even more agile. The focus should be on updating information regularly and uncovering trends, as opposed to collecting data that give very precise information about levels. It is more important to know that food security is rapidly worsening than to know the exact percentage of food-insecure people. This has implications for the way data collection systems are set up. Lighter surveys or mobile phone surveys should be the standard tools for data collection in FCV settings. Lighter surveys have the advantage that they can be implemented more rapidly and require less capacity for training and analysis. And once call centers have been set up, and phone numbers of different (potential) target groups have been collected, they can be used repeatedly.

With this book we hope to have pointed practitioners to relevant alternatives which can help meet critical data needs, even in the most difficult circumstances. Mobile phone surveys, discussed in Chapters 2–5, give a flavor of the possibilities. When mobile phone surveys are not an option and face-to-face interviews need to be conducted, alternatives can be found by relying on resident enumerators (discussed in Chapter 5), or by designing light data collection instruments like the commune census discussed in Chapter 6. When topics are narrower, for instance whether interventions are succeeding, then iterative beneficiary monitoring (IBM) (Chapter 13) offers an approach that can be followed. When sensitive questions need to be asked or when one is afraid responses might be biased, Chapters 10 and 11 offer pointers.

By sharing these innovations, we hope that many more people can benefit from them, joining us in our attempts to reduce data deprivation and, more importantly, extreme poverty. This book is prepared with practitioners in mind, and when necessary, we focused on showcasing examples rather than elaborating technical details of the approaches. We realize that the proposed approaches vary in complexity, time intensity, and cost. Some require a high level of technical expertise at the design stage; others are expensive or difficult to implement. Table 1 may serve as a guide on the kind of expertise that is needed to apply the different approaches discussed in this book.

We welcome feedback and enquiries and are happy to explain in greater depth the methods used and the approaches taken. Contact details for the authors can be found in the section on contributors.



The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.

Any dispute related to the use of the works of the International Bank for Reconstruction and Development/The World Bank that cannot be settled amicably shall be submitted to arbitration pursuant to the UNCITRAL rules. The use of the International Bank for Reconstruction and Development/The World Bank's name for any purpose other than for attribution, and the use of the International Bank for Reconstruction and Development/The World Bank's logo, shall be subject to a separate written license agreement between the International Bank for Reconstruction and Development/The World Bank and the user and is not authorized as part of this CC-IGO license. Note that the link provided above includes additional terms and conditions of the license.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Index**

**A**

adaptive questionnaire design 46, 47

**C**

Computer-Assisted Personal Interviews (CAPI) 10, 70–72, 157, 162, 166

conflict 2, 5, 6, 34, 36, 38, 41, 63, 64, 66, 74, 84–86, 130, 142, 153, 173, 175, 185, 188

consumption 4, 5, 8, 10, 23, 26, 27, 65, 92, 94, 119, 123, 154–168, 170, 193–195, 197–201, 203, 210, 236

consumption module 7, 10, 157–159, 163, 164, 166, 167, 170

**D**

displaced 3, 5, 6, 8, 52, 53, 55, 56, 58, 66, 84, 85, 87, 88, 95, 129, 131–133, 142, 193, 222, 237

displaced people 52–58, 60

district census 4, 86–88, 90, 91, 93, 95, 96

**E**

endorsement experiment 175, 177–182, 186, 190

enumeration area 8, 66, 67, 69, 72, 104–106, 118, 133, 134, 137, 141, 142, 144

**F**

fragile 2, 3, 5, 6, 10, 153, 154, 166, 167, 173, 176, 178, 181, 204, 209, 216, 230, 235, 236

fragility 2, 5, 34, 155, 173

Fragility, conflict, and violence (FCV) 1, 2, 5, 6, 8, 94, 173, 174, 190, 235–237

**G**

geospatial 142

GIS 112, 116, 124, 125, 137

#### **H**

humanitarian 33, 34, 131, 142, 195

#### **I**

insecurity 4–6, 37–39, 41, 43, 66, 68, 72, 73, 80, 84, 85, 105, 139, 175, 215, 221, 228, 237

Internally displaced people (IDP) 6, 58, 95, 105, 106, 134, 141, 163, 194, 195, 198

Iterative Beneficiary Monitoring (IBM) 4, 7, 10, 217–225, 230, 231, 237

#### **L**

list experiment 10, 174, 177, 181–186, 189, 190

listing 8, 93, 104, 109–112, 118, 123, 125, 133, 134, 137–141

Local Development Index (LDI) 8, 87, 89, 92, 93, 95–98

#### **M**

mobile phone 4, 5, 7, 8, 10, 11, 16–19, 21–24, 36, 45, 47, 48, 54, 60, 61, 67, 87–90, 96, 221, 236, 237

mobile population 67, 113

monitoring 8, 16, 18, 20, 22, 24, 36, 41, 48, 61, 64, 73–77, 81, 84, 87, 92, 153, 216–219, 230–232

#### **P**

pastoralist 112

phone interview 4, 17, 19, 22, 55, 67, 221

phone survey 18, 19, 21, 23, 35, 45, 47, 48, 237

plausible deniability 179, 189

#### **Q**

qualitative 218

#### **R**

randomized response 177, 181, 182, 184–186, 190

Rapid Emergency Response Survey (RERS) 11, 35, 36, 38

refugee 53, 55, 59, 106, 130–135, 138, 139, 141, 143

resident enumerator 8, 11, 66–68, 77, 237

risk 5, 22, 33–35, 39, 56, 77, 93, 95, 153, 164, 174, 177, 178, 183, 219, 236, 237

**S**

sampling 7, 8, 36, 87, 92, 112, 116, 119, 122, 123, 125, 131–134, 137–141, 147, 201, 202, 219, 231, 232

sampling frame 4–6, 8, 35, 36, 45, 47, 65, 85, 104, 105, 109, 131, 134, 141, 142, 236

survey 5, 6, 8, 10, 11, 17, 19, 20, 22, 23, 34–37, 41–48, 53–55, 57, 58, 60, 65–69, 71–73, 76, 80, 85, 86, 92–95, 106, 109, 112, 113, 116–120, 124, 131, 132, 134, 139, 140, 142, 143, 154, 155, 157, 163, 166, 167, 174–177, 180, 184, 186, 188, 195, 203, 204, 212, 220, 224, 226, 232

**U**

United Nations High Commissioner for Refugees (UNHCR) 6, 129, 130, 133–135

**V**

video 8, 10, 11, 18, 210–212

violence 3, 5, 34, 72, 77, 83, 84, 86, 142, 173, 194, 230, 235, 236

#### **W**

World Bank 1–3, 16–19, 34, 70, 76, 77, 84, 112, 118, 125, 142, 195, 212, 216, 222, 223

World Food Programme (WFP) 48, 94

#### **T**

tablet 11, 72

testimonial 212

tracking survey 59, 61